Distributed automated planning and execution platform for designing and running complex processes

ABSTRACT

A distributed automated planning and execution platform for designing, instantiating, and running complex and evolving processes is provided, comprising an automated planning service that receives an automated planning job, constructs a simulation using known information about the world, system resources, and other contextual data, assigns individual work tasks to worker nodes for processing, analyzes the results and assigns uncertainty values as needed, and produces an action plan as output.

CROSS-REFERENCE TO RELATED APPLICATIONS

Each entry below is listed as: Application No., Date Filed (Patent No. and Issue Date, where issued): Title.

Current application, filed herewith: DISTRIBUTED AUTOMATED PLANNING AND EXECUTION PLATFORM FOR DESIGNING AND RUNNING COMPLEX PROCESSES
Is a continuation-in-part of:
17/061,195, filed Oct. 1, 2020: CROWDSOURCED INNOVATION LABORATORY AND PROCESS IMPLEMENTATION SYSTEM
which is a continuation-in-part of:
17/035,029, filed Sep. 28, 2020: SYSTEM AND METHOD FOR CREATION AND IMPLEMENTATION OF DATA PROCESSING WORKFLOWS USING A DISTRIBUTED COMPUTATIONAL GRAPH
which is a continuation-in-part of:
17/008,276, filed Aug. 31, 2020: PRIVILEGE ASSURANCE OF ENTERPRISE COMPUTER NETWORK ENVIRONMENTS
which is a continuation-in-part of:
17/000,504, filed Aug. 24, 2020: ADVANCED DETECTION OF IDENTITY-BASED ATTACKS TO ASSURE IDENTITY FIDELITY IN INFORMATION TECHNOLOGY ENVIRONMENTS
which is a continuation-in-part of:
16/855,724, filed Apr. 22, 2020: ADVANCED CYBERSECURITY THREAT MITIGATION USING SOFTWARE SUPPLY CHAIN ANALYSIS
which is a continuation-in-part of:
16/836,717, filed Mar. 31, 2020: HOLISTIC COMPUTER SYSTEM CYBERSECURITY EVALUATION AND SCORING
which is a continuation-in-part of:
15/887,496, filed Feb. 2, 2018 (Patent 10,783,241, issued Sep. 22, 2020): SYSTEM AND METHODS FOR SANDBOXED MALWARE ANALYSIS AND AUTOMATED PATCH DEVELOPMENT, DEPLOYMENT AND VALIDATION
which is a continuation-in-part of:
15/818,733, filed Nov. 20, 2017 (Patent 10,673,887, issued Jun. 2, 2020): SYSTEM AND METHOD FOR CYBERSECURITY ANALYSIS AND SCORE GENERATION FOR INSURANCE PURPOSES
which is a continuation-in-part of:
15/725,274, filed Oct. 4, 2017 (Patent 10,609,079, issued Mar. 31, 2020): APPLICATION OF ADVANCED CYBERSECURITY THREAT MITIGATION TO ROGUE DEVICES, PRIVILEGE ESCALATION, AND RISK-BASED VULNERABILITY AND PATCH MANAGEMENT
which is a continuation-in-part of:
15/655,113, filed Jul. 20, 2017 (Patent 10,735,456, issued Aug. 4, 2020): ADVANCED CYBERSECURITY THREAT MITIGATION USING BEHAVIORAL AND DEEP ANALYTICS
which is a continuation-in-part of:
15/616,427, filed Jun. 7, 2017: RAPID PREDICTIVE ANALYSIS OF VERY LARGE DATA SETS USING AN ACTOR-DRIVEN DISTRIBUTED COMPUTATIONAL GRAPH
and is also a continuation-in-part of:
15/237,625, filed Aug. 15, 2016 (Patent 10,248,910, issued Apr. 2, 2019): DETECTION MITIGATION AND REMEDIATION OF CYBERATTACKS EMPLOYING AN ADVANCED CYBER-DECISION PLATFORM
which is a continuation-in-part of:
15/206,195, filed Jul. 8, 2016: ACCURATE AND DETAILED MODELING OF SYSTEMS WITH LARGE COMPLEX DATASETS USING A DISTRIBUTED SIMULATION ENGINE
which is a continuation-in-part of:
15/186,453, filed Jun. 18, 2016: SYSTEM FOR AUTOMATED CAPTURE AND ANALYSIS OF BUSINESS INFORMATION FOR RELIABLE BUSINESS VENTURE OUTCOME PREDICTION
which is a continuation-in-part of:
15/166,158, filed May 26, 2016: SYSTEM FOR AUTOMATED CAPTURE AND ANALYSIS OF BUSINESS INFORMATION FOR SECURITY AND CLIENT-FACING INFRASTRUCTURE RELIABILITY
which is a continuation-in-part of:
15/141,752, filed Apr. 28, 2016: SYSTEM FOR FULLY INTEGRATED CAPTURE, AND ANALYSIS OF BUSINESS INFORMATION RESULTING IN PREDICTIVE DECISION MAKING AND SIMULATION
which is a continuation-in-part of:
15/091,563, filed Apr. 5, 2016 (Patent 10,204,147, issued Feb. 12, 2019): SYSTEM FOR CAPTURE, ANALYSIS AND STORAGE OF TIME SERIES DATA FROM SENSORS WITH HETEROGENEOUS REPORT INTERVAL PROFILES
and is also a continuation-in-part of:
14/986,536, filed Dec. 31, 2015 (Patent 10,210,255, issued Feb. 19, 2019): DISTRIBUTED SYSTEM FOR LARGE VOLUME DEEP WEB DATA EXTRACTION
and is also a continuation-in-part of:
14/925,974, filed Oct. 28, 2015: RAPID PREDICTIVE ANALYSIS OF VERY LARGE DATA SETS USING THE DISTRIBUTED COMPUTATIONAL GRAPH

Current application, filed herewith: DISTRIBUTED AUTOMATED PLANNING AND EXECUTION PLATFORM FOR DESIGNING AND RUNNING COMPLEX PROCESSES
Is a continuation-in-part of:
17/061,195, filed Oct. 1, 2020: CROWDSOURCED INNOVATION LABORATORY AND PROCESS IMPLEMENTATION SYSTEM
which is a continuation-in-part of:
17/035,029, filed Sep. 28, 2020: SYSTEM AND METHOD FOR CREATION AND IMPLEMENTATION OF DATA PROCESSING WORKFLOWS USING A DISTRIBUTED COMPUTATIONAL GRAPH
which is a continuation-in-part of:
17/008,276, filed Aug. 31, 2020: PRIVILEGE ASSURANCE OF ENTERPRISE COMPUTER NETWORK ENVIRONMENTS
which is a continuation-in-part of:
17/000,504, filed Aug. 24, 2020: ADVANCED CYBERSECURITY THREAT MITIGATION USING SOFTWARE SUPPLY CHAIN ANALYSIS
which is a continuation-in-part of:
16/855,724, filed Apr. 22, 2020: ADVANCED CYBERSECURITY THREAT MITIGATION USING SOFTWARE SUPPLY CHAIN ANALYSIS
which is a continuation-in-part of:
16/836,717, filed Mar. 31, 2020: HOLISTIC COMPUTER SYSTEM CYBERSECURITY EVALUATION AND SCORING
which is a continuation-in-part of:
15/887,496, filed Feb. 2, 2018 (Patent 10,783,241, issued Sep. 22, 2020): SYSTEM AND METHODS FOR SANDBOXED MALWARE ANALYSIS AND AUTOMATED PATCH DEVELOPMENT, DEPLOYMENT AND VALIDATION
which is a continuation-in-part of:
15/823,285, filed Nov. 27, 2017 (Patent 10,740,096, issued Aug. 11, 2020): META-INDEXING, SEARCH, COMPLIANCE, AND TEST FRAMEWORK FOR SOFTWARE DEVELOPMENT
which is a continuation-in-part of:
15/788,718, filed Oct. 19, 2017: DATA MONETIZATION AND EXCHANGE PLATFORM
which claims priority, and benefit to:
62/568,307, filed Oct. 4, 2017: DATA MONETIZATION AND EXCHANGE PLATFORM
and is also a continuation-in-part of:
15/788,002, filed Oct. 19, 2017: ALGORITHM MONETIZATION AND EXCHANGE PLATFORM
which claims priority, and benefit to:
62/568,305, filed Oct. 4, 2017: ALGORITHM MONETIZATION AND EXCHANGE PLATFORM
and is also a continuation-in-part of:
15/787,601, filed Oct. 18, 2017: METHOD AND APPARATUS FOR CROWDSOURCED DATA GATHERING, EXTRACTION, AND COMPENSATION
which claims priority, and benefit to:
62/568,312, filed Oct. 4, 2017: METHOD AND APPARATUS FOR CROWDSOURCED DATA GATHERING, EXTRACTION, AND COMPENSATION
and is also a continuation-in-part of:
15/616,427, filed Jun. 7, 2017: RAPID PREDICTIVE ANALYSIS OF VERY LARGE DATA SETS USING AN ACTOR-DRIVEN DISTRIBUTED COMPUTATIONAL GRAPH
which is a continuation-in-part of:
14/925,974, filed Oct. 28, 2015: RAPID PREDICTIVE ANALYSIS OF VERY LARGE DATA SETS USING THE DISTRIBUTED COMPUTATIONAL GRAPH

Current application, filed herewith: DISTRIBUTED AUTOMATED PLANNING AND EXECUTION PLATFORM FOR DESIGNING AND RUNNING COMPLEX PROCESSES
Is a continuation-in-part of:
17/061,195, filed Oct. 1, 2020: CROWDSOURCED INNOVATION LABORATORY AND PROCESS IMPLEMENTATION SYSTEM
which is a continuation-in-part of:
17/035,029, filed Sep. 28, 2020: SYSTEM AND METHOD FOR CREATION AND IMPLEMENTATION OF DATA PROCESSING WORKFLOWS USING A DISTRIBUTED COMPUTATIONAL GRAPH
which is a continuation-in-part of:
17/008,276, filed Aug. 31, 2020: PRIVILEGE ASSURANCE OF ENTERPRISE COMPUTER NETWORK ENVIRONMENTS
which is a continuation-in-part of:
17/000,504, filed Aug. 24, 2020: ADVANCED DETECTION OF IDENTITY-BASED ATTACKS TO ASSURE IDENTITY FIDELITY IN INFORMATION TECHNOLOGY ENVIRONMENTS
which is a continuation-in-part of:
16/855,724, filed Apr. 22, 2020: ADVANCED CYBERSECURITY THREAT MITIGATION USING SOFTWARE SUPPLY CHAIN ANALYSIS
which is a continuation-in-part of:
16/777,270, filed Jan. 30, 2020: CYBERSECURITY PROFILING AND RATING USING ACTIVE AND PASSIVE EXTERNAL RECONNAISSANCE
which is a continuation-in-part of:
16/720,383, filed Dec. 19, 2019: RATING ORGANIZATION CYBERSECURITY USING ACTIVE AND PASSIVE EXTERNAL RECONNAISSANCE
which is a continuation of:
15/823,363, filed Nov. 27, 2017 (Patent 10,560,483, issued Feb. 11, 2020): RATING ORGANIZATION CYBERSECURITY USING ACTIVE AND PASSIVE EXTERNAL RECONNAISSANCE
which is a continuation-in-part of:
15/725,274, filed Oct. 4, 2017 (Patent 10,609,079, issued Mar. 31, 2020): APPLICATION OF ADVANCED CYBERSECURITY THREAT MITIGATION TO ROGUE DEVICES, PRIVILEGE ESCALATION, AND RISK-BASED VULNERABILITY AND PATCH MANAGEMENT

Current application, filed herewith: DISTRIBUTED AUTOMATED PLANNING AND EXECUTION PLATFORM FOR DESIGNING AND RUNNING COMPLEX PROCESSES
Is a continuation-in-part of:
17/061,195, filed Oct. 1, 2020: CROWDSOURCED INNOVATION LABORATORY AND PROCESS IMPLEMENTATION SYSTEM
which is a continuation-in-part of:
17/035,029, filed Sep. 28, 2020: SYSTEM AND METHOD FOR CREATION AND IMPLEMENTATION OF DATA PROCESSING WORKFLOWS USING A DISTRIBUTED COMPUTATIONAL GRAPH
which is a continuation-in-part of:
17/008,276, filed Aug. 31, 2020: PRIVILEGE ASSURANCE OF ENTERPRISE COMPUTER NETWORK ENVIRONMENTS
which is a continuation-in-part of:
17/000,504, filed Aug. 24, 2020: ADVANCED DETECTION OF IDENTITY-BASED ATTACKS TO ASSURE IDENTITY FIDELITY IN INFORMATION TECHNOLOGY ENVIRONMENTS
which is a continuation-in-part of:
16/412,340, filed May 14, 2019: SECURE POLICY-CONTROLLED PROCESSING AND AUDITING ON REGULATED DATA SETS
which is a continuation-in-part of:
16/267,893, filed Feb. 5, 2019: SYSTEM AND METHODS FOR DETECTING AND CHARACTERIZING ELECTROMAGNETIC EMISSIONS
which is a continuation-in-part of:
16/248,133, filed Jan. 15, 2019: SYSTEM AND METHOD FOR MULTI-MODEL GENERATIVE SIMULATION MODELING OF COMPLEX ADAPTIVE SYSTEMS
which is a continuation-in-part of:
15/813,097, filed Nov. 14, 2017: EPISTEMIC UNCERTAINTY REDUCTION USING SIMULATIONS, MODELS AND DATA EXCHANGE
which is a continuation-in-part of:
15/616,427, filed Jun. 7, 2017: RAPID PREDICTIVE ANALYSIS OF VERY LARGE DATA SETS USING AN ACTOR-DRIVEN DISTRIBUTED COMPUTATIONAL GRAPH

Current application, filed herewith: DISTRIBUTED AUTOMATED PLANNING AND EXECUTION PLATFORM FOR DESIGNING AND RUNNING COMPLEX PROCESSES
Is a continuation-in-part of:
17/061,195, filed Oct. 1, 2020: CROWDSOURCED INNOVATION LABORATORY AND PROCESS IMPLEMENTATION SYSTEM
which is a continuation-in-part of:
17/035,029, filed Sep. 28, 2020: SYSTEM AND METHOD FOR CREATION AND IMPLEMENTATION OF DATA PROCESSING WORKFLOWS USING A DISTRIBUTED COMPUTATIONAL GRAPH
which is a continuation-in-part of:
17/008,276, filed Aug. 31, 2020: PRIVILEGE ASSURANCE OF ENTERPRISE COMPUTER NETWORK ENVIRONMENTS
which is a continuation-in-part of:
17/000,504, filed Aug. 24, 2020: ADVANCED DETECTION OF IDENTITY-BASED ATTACKS TO ASSURE IDENTITY FIDELITY IN INFORMATION TECHNOLOGY ENVIRONMENTS
which is a continuation-in-part of:
16/412,340, filed May 14, 2019: SECURE POLICY-CONTROLLED PROCESSING AND AUDITING ON REGULATED DATA SETS
which is a continuation-in-part of:
16/267,893, filed Feb. 5, 2019: SYSTEM AND METHODS FOR DETECTING AND CHARACTERIZING ELECTROMAGNETIC EMISSIONS
which is a continuation-in-part of:
16/248,133, filed Jan. 15, 2019: SYSTEM AND METHOD FOR MULTI-MODEL GENERATIVE SIMULATION MODELING OF COMPLEX ADAPTIVE SYSTEMS
which is also a continuation-in-part of:
15/806,697, filed Nov. 8, 2017: MODELING MULTI-PERIL CATASTROPHE USING A DISTRIBUTED SIMULATION ENGINE
which is a continuation-in-part of:
15/376,657, filed Dec. 13, 2016 (Patent 10,402,906, issued Sep. 3, 2019): QUANTIFICATION FOR INVESTMENT VEHICLE MANAGEMENT EMPLOYING AN ADVANCED DECISION PLATFORM
which is a continuation-in-part of:
15/237,625, filed Aug. 15, 2016 (Patent 10,248,910, issued Apr. 2, 2019): DETECTION MITIGATION AND REMEDIATION OF CYBERATTACKS EMPLOYING AN ADVANCED CYBER-DECISION PLATFORM

Current application, filed herewith: DISTRIBUTED AUTOMATED PLANNING AND EXECUTION PLATFORM FOR DESIGNING AND RUNNING COMPLEX PROCESSES
Is a continuation-in-part of:
17/061,195, filed Oct. 1, 2020: CROWDSOURCED INNOVATION LABORATORY AND PROCESS IMPLEMENTATION SYSTEM
which is a continuation-in-part of:
17/035,029, filed Sep. 28, 2020: SYSTEM AND METHOD FOR CREATION AND IMPLEMENTATION OF DATA PROCESSING WORKFLOWS USING A DISTRIBUTED COMPUTATIONAL GRAPH
which is a continuation-in-part of:
17/008,276, filed Aug. 31, 2020: PRIVILEGE ASSURANCE OF ENTERPRISE COMPUTER NETWORK ENVIRONMENTS
which is a continuation-in-part of:
17/000,504, filed Aug. 24, 2020: ADVANCED DETECTION OF IDENTITY-BASED ATTACKS TO ASSURE IDENTITY FIDELITY IN INFORMATION TECHNOLOGY ENVIRONMENTS
which is a continuation-in-part of:
16/412,340, filed May 14, 2019: SECURE POLICY-CONTROLLED PROCESSING AND AUDITING ON REGULATED DATA SETS
which is a continuation-in-part of:
16/267,893, filed Feb. 5, 2019: SYSTEM AND METHODS FOR DETECTING AND CHARACTERIZING ELECTROMAGNETIC EMISSIONS
which is a continuation-in-part of:
16/248,133, filed Jan. 15, 2019: SYSTEM AND METHOD FOR MULTI-MODEL GENERATIVE SIMULATION MODELING OF COMPLEX ADAPTIVE SYSTEMS
which is a continuation-in-part of:
15/806,697, filed Nov. 8, 2017: MODELING MULTI-PERIL CATASTROPHE USING A DISTRIBUTED SIMULATION ENGINE
which is a continuation-in-part of:
15/343,209, filed Nov. 4, 2016: RISK QUANTIFICATION FOR INSURANCE PROCESS MANAGEMENT EMPLOYING AN ADVANCED DECISION PLATFORM
which is a continuation-in-part of:
15/237,625, filed Aug. 15, 2016 (Patent 10,248,910, issued Apr. 2, 2019): DETECTION MITIGATION AND REMEDIATION OF CYBERATTACKS EMPLOYING AN ADVANCED CYBER-DECISION PLATFORM
and is also a continuation-in-part of:
15/229,476, filed Aug. 5, 2016 (Patent 10,454,791, issued Oct. 22, 2019): HIGHLY SCALABLE DISTRIBUTED CONNECTION INTERFACE FOR DATA CAPTURE FROM MULTIPLE NETWORK SERVICE SOURCES
which is a continuation-in-part of:
15/206,195, filed Jul. 8, 2016: ACCURATE AND DETAILED MODELING OF SYSTEMS WITH LARGE COMPLEX DATASETS USING A DISTRIBUTED SIMULATION ENGINE

Current application, filed herewith: DISTRIBUTED AUTOMATED PLANNING AND EXECUTION PLATFORM FOR DESIGNING AND RUNNING COMPLEX PROCESSES
Is a continuation-in-part of:
17/061,195, filed Oct. 1, 2020: CROWDSOURCED INNOVATION LABORATORY AND PROCESS IMPLEMENTATION SYSTEM
which is a continuation-in-part of:
17/035,029, filed Sep. 28, 2020: SYSTEM AND METHOD FOR CREATION AND IMPLEMENTATION OF DATA PROCESSING WORKFLOWS USING A DISTRIBUTED COMPUTATIONAL GRAPH
which is a continuation-in-part of:
17/008,276, filed Aug. 31, 2020: PRIVILEGE ASSURANCE OF ENTERPRISE COMPUTER NETWORK ENVIRONMENTS
which is a continuation-in-part of:
17/000,504, filed Aug. 24, 2020: ADVANCED DETECTION OF IDENTITY-BASED ATTACKS TO ASSURE IDENTITY FIDELITY IN INFORMATION TECHNOLOGY ENVIRONMENTS
which is a continuation-in-part of:
16/412,340, filed May 14, 2019: SECURE POLICY-CONTROLLED PROCESSING AND AUDITING ON REGULATED DATA SETS
which is a continuation-in-part of:
16/267,893, filed Feb. 5, 2019: SYSTEM AND METHODS FOR DETECTING AND CHARACTERIZING ELECTROMAGNETIC EMISSIONS
which is a continuation-in-part of:
16/248,133, filed Jan. 15, 2019: SYSTEM AND METHOD FOR MULTI-MODEL GENERATIVE SIMULATION MODELING OF COMPLEX ADAPTIVE SYSTEMS
which is a continuation-in-part of:
15/673,368, filed Aug. 9, 2017: AUTOMATED SELECTION AND PROCESSING OF FINANCIAL MODELS
which is a continuation-in-part of:
15/376,657, filed Dec. 13, 2016 (Patent 10,402,906, issued Sep. 3, 2019): QUANTIFICATION FOR INVESTMENT VEHICLE MANAGEMENT EMPLOYING AN ADVANCED DECISION PLATFORM

Current application, filed herewith: DISTRIBUTED AUTOMATED PLANNING AND EXECUTION PLATFORM FOR DESIGNING AND RUNNING COMPLEX PROCESSES
Is a continuation-in-part of:
17/061,195, filed Oct. 1, 2020: CROWDSOURCED INNOVATION LABORATORY AND PROCESS IMPLEMENTATION SYSTEM
which is a continuation-in-part of:
17/035,029, filed Sep. 28, 2020: SYSTEM AND METHOD FOR CREATION AND IMPLEMENTATION OF DATA PROCESSING WORKFLOWS USING A DISTRIBUTED COMPUTATIONAL GRAPH
which is a continuation-in-part of:
17/008,276, filed Aug. 31, 2020: PRIVILEGE ASSURANCE OF ENTERPRISE COMPUTER NETWORK ENVIRONMENTS
which is a continuation-in-part of:
17/000,504, filed Aug. 24, 2020: ADVANCED DETECTION OF IDENTITY-BASED ATTACKS TO ASSURE IDENTITY FIDELITY IN INFORMATION TECHNOLOGY ENVIRONMENTS
which is a continuation-in-part of:
16/412,340, filed May 14, 2019: SECURE POLICY-CONTROLLED PROCESSING AND AUDITING ON REGULATED DATA SETS
which is a continuation-in-part of:
16/267,893, filed Feb. 5, 2019: SYSTEM AND METHODS FOR DETECTING AND CHARACTERIZING ELECTROMAGNETIC EMISSIONS
which is a continuation-in-part of:
16/248,133, filed Jan. 15, 2019: SYSTEM AND METHOD FOR MULTI-MODEL GENERATIVE SIMULATION MODELING OF COMPLEX ADAPTIVE SYSTEMS
which is a continuation-in-part of:
15/849,901, filed Dec. 21, 2017: SYSTEM AND METHOD FOR OPTIMIZATION AND LOAD BALANCING OF COMPUTER CLUSTERS
which is a continuation-in-part of:
15/835,312, filed Dec. 7, 2017: SYSTEM AND METHODS FOR MULTI-LANGUAGE ABSTRACT MODEL CREATION FOR DIGITAL ENVIRONMENT SIMULATIONS
which is a continuation-in-part of:
15/186,453, filed Jun. 18, 2016: SYSTEM FOR AUTOMATED CAPTURE AND ANALYSIS OF BUSINESS INFORMATION FOR RELIABLE BUSINESS VENTURE OUTCOME PREDICTION

Current application, filed herewith: DISTRIBUTED AUTOMATED PLANNING AND EXECUTION PLATFORM FOR DESIGNING AND RUNNING COMPLEX PROCESSES
Is a continuation-in-part of:
17/061,195, filed Oct. 1, 2020: CROWDSOURCED INNOVATION LABORATORY AND PROCESS IMPLEMENTATION SYSTEM
which is a continuation-in-part of:
17/035,029, filed Sep. 28, 2020: SYSTEM AND METHOD FOR CREATION AND IMPLEMENTATION OF DATA PROCESSING WORKFLOWS USING A DISTRIBUTED COMPUTATIONAL GRAPH
which is a continuation-in-part of:
17/008,276, filed Aug. 31, 2020: PRIVILEGE ASSURANCE OF ENTERPRISE COMPUTER NETWORK ENVIRONMENTS
which is a continuation-in-part of:
17/000,504, filed Aug. 24, 2020: ADVANCED DETECTION OF IDENTITY-BASED ATTACKS TO ASSURE IDENTITY FIDELITY IN INFORMATION TECHNOLOGY ENVIRONMENTS
which is a continuation-in-part of:
16/412,340, filed May 14, 2019: SECURE POLICY-CONTROLLED PROCESSING AND AUDITING ON REGULATED DATA SETS
which is a continuation-in-part of:
16/267,893, filed Feb. 5, 2019: SYSTEM AND METHODS FOR DETECTING AND CHARACTERIZING ELECTROMAGNETIC EMISSIONS
which is a continuation-in-part of:
16/248,133, filed Jan. 15, 2019: SYSTEM AND METHOD FOR MULTI-MODEL GENERATIVE SIMULATION MODELING OF COMPLEX ADAPTIVE SYSTEMS
which is a continuation-in-part of:
15/849,901, filed Dec. 21, 2017: SYSTEM AND METHOD FOR OPTIMIZATION AND LOAD BALANCING OF COMPUTER CLUSTERS
which is a continuation-in-part of:
15/835,436, filed Dec. 7, 2017 (Patent 10,572,828, issued Feb. 25, 2020): TRANSFER LEARNING AND DOMAIN ADAPTATION USING DISTRIBUTABLE DATA MODELS
which is a continuation-in-part of:
15/790,457, filed Oct. 23, 2017: DISTRIBUTABLE MODEL WITH BIASES CONTAINED WITHIN DISTRIBUTED DATA
which claims benefit of, and priority to:
62/568,298, filed Oct. 4, 2017: DISTRIBUTABLE MODEL WITH BIASES CONTAINED IN DISTRIBUTED DATA
and is also a continuation-in-part of:
15/790,327, filed Oct. 23, 2017: DISTRIBUTABLE MODEL WITH DISTRIBUTED DATA
which claims benefit of, and priority to:
62/568,291, filed Oct. 4, 2017: DISTRIBUTABLE MODEL WITH DISTRIBUTED DATA
and is also a continuation-in-part of:
15/616,427, filed Jun. 7, 2017: RAPID PREDICTIVE ANALYSIS OF VERY LARGE DATA SETS USING AN ACTOR-DRIVEN DISTRIBUTED COMPUTATIONAL GRAPH
and is also a continuation-in-part of:
15/141,752, filed Apr. 28, 2016: SYSTEM FOR FULLY INTEGRATED CAPTURE, AND ANALYSIS OF BUSINESS INFORMATION RESULTING IN PREDICTIVE DECISION MAKING AND SIMULATION

Current application, filed herewith: DISTRIBUTED AUTOMATED PLANNING AND EXECUTION PLATFORM FOR DESIGNING AND RUNNING COMPLEX PROCESSES
Is a continuation-in-part of:
17/061,195, filed Oct. 1, 2020: CROWDSOURCED INNOVATION LABORATORY AND PROCESS IMPLEMENTATION SYSTEM
which is a continuation-in-part of:
15/879,801, filed Jan. 25, 2018: PLATFORM FOR MANAGEMENT AND TRACKING OF COLLABORATIVE PROJECTS
which is a continuation-in-part of:
15/379,899, filed Dec. 15, 2016: INCLUSION OF TIME SERIES GEOSPATIAL MARKERS IN ANALYSES EMPLOYING AN ADVANCED CYBER-DECISION PLATFORM
which is a continuation-in-part of:
15/376,657, filed Dec. 13, 2016 (Patent 10,402,906, issued Sep. 3, 2019): QUANTIFICATION FOR INVESTMENT VEHICLE MANAGEMENT EMPLOYING AN ADVANCED DECISION PLATFORM
which is a continuation-in-part of:
15/237,625, filed Aug. 15, 2016 (Patent 10,248,910, issued Apr. 2, 2019): DETECTION MITIGATION AND REMEDIATION OF CYBERATTACKS EMPLOYING AN ADVANCED CYBER-DECISION PLATFORM
which is a continuation-in-part of:
15/206,195, filed Jul. 8, 2016: ACCURATE AND DETAILED MODELING OF SYSTEMS WITH LARGE COMPLEX DATASETS USING A DISTRIBUTED SIMULATION ENGINE
which is a continuation-in-part of:
15/186,453, filed Jun. 18, 2016: SYSTEM FOR AUTOMATED CAPTURE AND ANALYSIS OF BUSINESS INFORMATION FOR RELIABLE BUSINESS VENTURE OUTCOME PREDICTION
which is a continuation-in-part of:
15/166,158, filed May 26, 2016: SYSTEM FOR AUTOMATED CAPTURE AND ANALYSIS OF BUSINESS INFORMATION FOR SECURITY AND CLIENT-FACING INFRASTRUCTURE RELIABILITY
which is a continuation-in-part of:
15/141,752, filed Apr. 28, 2016: SYSTEM FOR FULLY INTEGRATED CAPTURE, AND ANALYSIS OF BUSINESS INFORMATION RESULTING IN PREDICTIVE DECISION MAKING AND SIMULATION
which is a continuation-in-part of:
15/091,563, filed Apr. 5, 2016 (Patent 10,204,147, issued Feb. 12, 2019): SYSTEM FOR CAPTURE, ANALYSIS AND STORAGE OF TIME SERIES DATA FROM SENSORS WITH HETEROGENEOUS REPORT INTERVAL PROFILES
and is also a continuation-in-part of:
14/986,536, filed Dec. 31, 2015 (Patent 10,210,255, issued Feb. 19, 2019): DISTRIBUTED SYSTEM FOR LARGE VOLUME DEEP WEB DATA EXTRACTION
and is also a continuation-in-part of:
14/925,974, filed Oct. 28, 2015: RAPID PREDICTIVE ANALYSIS OF VERY LARGE DATA SETS USING THE DISTRIBUTED COMPUTATIONAL GRAPH

Current application, filed herewith: DISTRIBUTED AUTOMATED PLANNING AND EXECUTION PLATFORM FOR DESIGNING AND RUNNING COMPLEX PROCESSES
Is a continuation-in-part of:
16/709,598, filed Dec. 10, 2019: RAPID PREDICTIVE ANALYSIS OF VERY LARGE DATA SETS USING THE DISTRIBUTED COMPUTATIONAL GRAPH USING CONFIGURABLE ARRANGEMENT OF PROCESSING COMPONENTS
which is a continuation-in-part of:
14/925,974, filed Oct. 28, 2015: RAPID PREDICTIVE ANALYSIS OF VERY LARGE DATA SETS USING THE DISTRIBUTED COMPUTATIONAL GRAPH

the entire specification of each of which is incorporated herein by reference.

BACKGROUND OF THE INVENTION

Field of the Invention

The disclosure relates to the field of automated tracking and management of collaborative projects.

Discussion of the State of the Art

Automated planning is a branch of artificial intelligence that focuses on computationally creating ordered sets of actions to perform a given task. Artificial intelligence has been used in this way for decades to plan robotic activity, to control unmanned vehicles, and to plan manned operations such as space missions. However, current systems do not scale well when the number of objects in a given problem space is large. Current systems attempt to deal with this limitation using domain-independent heuristics; however, this is an inelegant solution that results in large search trees that cannot be processed in a reasonable time, thus imposing a de facto limit on the size of the problem a given system is capable of handling.

What is needed is a system that will allow users to design, instantiate, and run complex and evolving processes using a distributed, scalable system that can handle increasingly complex problems through a distributed architecture and distributed computational graph-based data transformation pipelines.

SUMMARY OF THE INVENTION

Accordingly, the inventor has developed and reduced to practice, a distributed automated planning and execution platform for designing, instantiating, and running complex and evolving processes. In a typical embodiment, a platform may be deployed in a distributed or federated architecture, without the need for strict synchronization across system components and instead relying on an “eventual agreement” model wherein consistency is achieved in an asynchronous, yet certain, manner. The system may also employ various machine learning techniques and simulations to continually improve and evolve through the use of data transformation pipelines, ensuring the system can scale as needed to handle increasingly complex processes.
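By way of non-limiting illustration only, the “eventual agreement” model described above may be sketched as follows. The disclosure does not specify a reconciliation mechanism; this sketch assumes a last-writer-wins merge keyed on a logical clock and a node identifier, and the names Replica, write, and merge are hypothetical. Because the merge keeps the greater (clock, node identifier) pair for each key, replicas that exchange state in any order arrive at the same state without strict synchronization.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# (logical clock, writer node id, value); an assumed record layout.
Versioned = Tuple[int, str, str]


@dataclass
class Replica:
    """Hypothetical system component holding a local copy of shared state."""
    node_id: str
    clock: int = 0
    state: Dict[str, Versioned] = field(default_factory=dict)

    def write(self, key: str, value: str) -> None:
        # Local writes advance the logical clock; no coordination is needed.
        self.clock += 1
        self.state[key] = (self.clock, self.node_id, value)

    def merge(self, other: "Replica") -> None:
        # Keep, per key, the entry with the greater (clock, node id) pair.
        # This merge is commutative, associative, and idempotent, so the
        # order and timing of exchanges do not affect the final state.
        for key, entry in other.state.items():
            mine = self.state.get(key)
            if mine is None or entry[:2] > mine[:2]:
                self.state[key] = entry
        self.clock = max(self.clock, other.clock)


a, b, c = Replica("a"), Replica("b"), Replica("c")
a.write("plan", "v1")   # concurrent writes on two components
b.write("plan", "v2")
c.merge(a)              # asynchronous exchanges, in arbitrary order
c.merge(b)
a.merge(b)
b.merge(a)
print(a.state == b.state == c.state)
```

In this sketch the tie between the two concurrent writes is broken by node identifier, which is one arbitrary but deterministic choice; other embodiments may resolve conflicts differently.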

According to a preferred embodiment, a system for management and tracking of collaborative projects, comprising: an automated planning service comprising a memory, a processor, and a plurality of programming instructions stored in the memory thereof and operable on the processor thereof, wherein the programmable instructions, when operating on the processor, cause the processor to: operate a plurality of master nodes, each master node in turn operating a plurality of worker nodes; receive a plurality of simulation conditions at a master node; construct a plurality of simulation components based on the received simulation conditions; construct a planning model based on the received simulation conditions; assign, using a master node, a plurality of discrete simulation tasks to a plurality of worker nodes, each of the plurality of discrete simulation tasks being based on the constructed simulation components and planning model, wherein each of the plurality of worker nodes is assigned exactly one of the plurality of discrete simulation tasks at any given time during operation; analyze results of each of the plurality of discrete simulation tasks as they are completed; and provide the analyzed results as output, is disclosed.

According to another preferred embodiment, a method for management and tracking of collaborative projects, comprising the steps of: operating, at an automated planning service, a plurality of master nodes, each master node in turn operating a plurality of worker nodes; receiving a plurality of simulation conditions at a master node; constructing a plurality of simulation components based on the received simulation conditions; constructing a planning model based on the received simulation conditions; assigning, using a master node, a plurality of discrete simulation tasks to a plurality of worker nodes, each of the plurality of discrete simulation tasks being based on the constructed simulation components and planning model, wherein each of the plurality of worker nodes is assigned exactly one of the plurality of discrete simulation tasks at any given time during operation; analyzing results of each of the plurality of discrete simulation tasks as they are completed; and providing the analyzed results as output, is disclosed.
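By way of non-limiting illustration only, the assignment of discrete simulation tasks by a master node to worker nodes may be sketched as follows. The class names MasterNode and WorkerNode, the run_job and analyze methods, and the rounding used as a stand-in for result analysis are all assumptions made for this sketch and are not taken from the disclosure. Workers run sequentially here for simplicity, and each worker holds no more than one task at a time.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class WorkerNode:
    """Hypothetical worker node; holds at most one simulation task at a time."""
    name: str
    current_task: Optional[int] = None

    def run(self, task_id: int, task: Callable[[], float]) -> float:
        if self.current_task is not None:
            raise RuntimeError(f"{self.name} already holds task {self.current_task}")
        self.current_task = task_id
        try:
            return task()  # execute the discrete simulation task
        finally:
            self.current_task = None  # free the worker for its next assignment


@dataclass
class MasterNode:
    """Hypothetical master node that assigns tasks and analyzes results."""
    workers: List[WorkerNode]
    assignments: List[Tuple[str, int]] = field(default_factory=list)

    def run_job(self, tasks: List[Callable[[], float]]) -> Dict[int, float]:
        pending = deque(enumerate(tasks))
        results: Dict[int, float] = {}
        while pending:
            for worker in self.workers:  # one task per worker per round
                if not pending:
                    break
                task_id, task = pending.popleft()
                self.assignments.append((worker.name, task_id))
                # Analyze each result as soon as its task completes.
                results[task_id] = self.analyze(worker.run(task_id, task))
        return results

    @staticmethod
    def analyze(raw: float) -> float:
        # Placeholder analysis (assumption): round the raw simulation value.
        return round(raw, 3)


master = MasterNode([WorkerNode("w1"), WorkerNode("w2")])
print(master.run_job([lambda: 1.23456, lambda: 2.0, lambda: 3.14159]))
print(master.assignments)
```

A production embodiment would likely dispatch tasks concurrently across machines; the sequential loop is used here only to keep the one-task-per-worker constraint easy to see.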

BRIEF DESCRIPTION OF THE DRAWING FIGURES

The accompanying drawings illustrate several aspects and, together with the description, serve to explain the principles of the invention according to the aspects. It will be appreciated by one skilled in the art that the particular arrangements illustrated in the drawings are merely exemplary and are not to be considered as limiting of the scope of the invention or the claims herein in any way.

FIG. 1 is a block diagram illustrating an exemplary hardware architecture of a computing device used in various embodiments of the invention.

FIG. 2 is a block diagram illustrating an exemplary logical architecture for a client device, according to various embodiments of the invention.

FIG. 3 is a block diagram illustrating an exemplary architectural arrangement of clients, servers, and external services, according to various embodiments of the invention.

FIG. 4 is a block diagram illustrating an exemplary overview of a computer system as may be used in any of the various locations throughout the system.

FIG. 5 is a diagram of an exemplary architecture for a system where streams of input data from one or more of a plurality of sources are analyzed to predict outcome using both batch analysis of acquired data and transformation pipeline manipulation of current streaming data according to an embodiment of the invention.

FIG. 6 is a diagram of an exemplary architecture for a linear transformation pipeline system which introduces the concept of the transformation pipeline as a directed graph of transformation nodes and messages according to an embodiment of the invention.

FIG. 7 is a diagram of an exemplary architecture for a transformation pipeline system where one of the transformations receives input from more than one source which introduces the concept of the transformation pipeline as a directed graph of transformation nodes and messages according to an embodiment of the invention.

FIG. 8 is a diagram of an exemplary architecture for a transformation pipeline system where the output of one data transformation serves as the input of more than one downstream transformation which introduces the concept of the transformation pipeline as a directed graph of transformation nodes and messages according to an embodiment of the invention.

FIG. 9 is a diagram of an exemplary architecture for a transformation pipeline system where a set of three data transformations act to form a cyclical pipeline which also introduces the concept of the transformation pipeline as a directed graph of transformation nodes and messages according to an embodiment of the invention.

FIG. 10 is a process flow diagram of a method for the receipt, processing and predictive analysis of streaming data using a system of the invention.

FIG. 11 is a process flow diagram of a method for representing the operation of the transformation pipeline as a directed graph function using a system of the invention.

FIG. 12 is a process flow diagram of a method for a linear data transformation pipeline using a system of the invention.

FIG. 13 is a process flow diagram of a method for the disposition of input from two antecedent data transformations into a single data transformation of transformation pipeline using a system of the invention.

FIG. 14 is a process flow diagram of a method for the disposition of output of one data transformation that then serves as input to two postliminary data transformations using a system of the invention.

FIG. 15 is a process flow diagram of a method for processing a set of three or more data transformations within a data transformation pipeline where output of the last member transformation of the set serves as input of the first member transformation thereby creating a cyclical relationship using a system of the invention.

FIG. 16 is a process flow diagram of a method for the receipt and use of streaming data into batch storage and analysis of changes over time, repetition of specific data sequences or the presence of critical data points using a system of the invention.

FIG. 17 is a diagram of a computing architecture for a processing system according to one aspect of the present invention.

FIG. 18 is a diagram of a computing pipeline architecture for a processing system according to one aspect of the present invention.

FIG. 19 is a diagram of computing operating states for a processing system according to one aspect of the present invention.

FIGS. 20A-20D are a process flow diagram for a set of processing operations used in a pipeline processing system according to one aspect of the present invention.

FIG. 21 is a system diagram detailing the components of a Production Rule System (PRS), according to an embodiment.

FIG. 22 is a system diagram illustrating cyclic workflow stages in a pipeline of data analysis, according to an embodiment.

FIG. 23 is a block diagram illustrating an exemplary system architecture for automated planning, according to a preferred embodiment.

FIG. 24 is a flow diagram illustrating an exemplary overview of a process for automated planning, according to a preferred embodiment.

FIG. 25 is a flow diagram illustrating an exemplary process for a single-run AP job.

FIG. 26 is a block diagram illustrating an exemplary process for a multiple-run AP job.

DETAILED DESCRIPTION

The inventor has conceived, and reduced to practice, a distributed automated planning and execution platform for designing, instantiating, and running complex and evolving processes.

One or more different inventions may be described in the present application. Further, for one or more of the inventions described herein, numerous alternative embodiments may be described; it should be understood that these are presented for illustrative purposes only. The described embodiments are not intended to be limiting in any sense. One or more of the inventions may be widely applicable to numerous embodiments, as is readily apparent from the disclosure. In general, embodiments are described in sufficient detail to enable those skilled in the art to practice one or more of the inventions, and it is to be understood that other embodiments may be utilized and that structural, logical, software, electrical and other changes may be made without departing from the scope of the particular inventions. Accordingly, those skilled in the art will recognize that one or more of the inventions may be practiced with various modifications and alterations. Particular features of one or more of the inventions may be described with reference to one or more particular embodiments or figures that form a part of the present disclosure, and in which are shown, by way of illustration, specific embodiments of one or more of the inventions. It should be understood, however, that such features are not limited to usage in the one or more particular embodiments or figures with reference to which they are described. The present disclosure is neither a literal description of all embodiments of one or more of the inventions nor a listing of features of one or more of the inventions that must be present in all embodiments.

Headings of sections provided in this patent application and the title of this patent application are for convenience only, and are not to be taken as limiting the disclosure in any way.

Devices that are in communication with each other need not be in continuous communication with each other, unless expressly specified otherwise. In addition, devices that are in communication with each other may communicate directly or indirectly through one or more intermediaries, logical or physical.

A description of an embodiment with several components in communication with each other does not imply that all such components are required. To the contrary, a variety of optional components may be described to illustrate a wide variety of possible embodiments of one or more of the inventions and in order to more fully illustrate one or more aspects of the inventions. Similarly, although process steps, method steps, algorithms or the like may be described in a sequential order, such processes, methods and algorithms may generally be configured to work in alternate orders, unless specifically stated to the contrary. In other words, any sequence or order of steps that may be described in this patent application does not, in and of itself, indicate a requirement that the steps be performed in that order. The steps of described processes may be performed in any order practical.

Further, some steps may be performed simultaneously despite being described or implied as occurring sequentially (e.g., because one step is described after the other step). Moreover, the illustration of a process by its depiction in a drawing does not imply that the illustrated process is exclusive of other variations and modifications thereto, does not imply that the illustrated process or any of its steps are necessary to one or more of the invention(s), and does not imply that the illustrated process is preferred. Also, steps are generally described once per embodiment, but this does not mean they must occur once, or that they may only occur once each time a process, method, or algorithm is carried out or executed. Some steps may be omitted in some embodiments or some occurrences, or some steps may be executed more than once in a given embodiment or occurrence.

When a single device or article is described, it will be readily apparent that more than one device or article may be used in place of a single device or article. Similarly, where more than one device or article is described, it will be readily apparent that a single device or article may be used in place of the more than one device or article.

The functionality or the features of a device may be alternatively embodied by one or more other devices that are not explicitly described as having such functionality or features. Thus, other embodiments of one or more of the inventions need not include the device itself.

Techniques and mechanisms described or referenced herein will sometimes be described in singular form for clarity. However, it should be noted that particular embodiments include multiple iterations of a technique or multiple manifestations of a mechanism unless noted otherwise. Process descriptions or blocks in figures should be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps in the process. Alternate implementations are included within the scope of embodiments of the present invention in which, for example, functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those having ordinary skill in the art.

Definitions

As used herein, “graph” is a representation of information and relationships, where each primary unit of information makes up a “node” or “vertex” of the graph and the relationship between two nodes makes up an edge of the graph. Nodes can be further qualified by the connection of one or more descriptors or “properties” to that node. For example, given the node “James R,” name information for a person, qualifying properties might be “183 cm tall”, “DOB Aug. 13, 1965” and “speaks English”. Similar to the use of properties to further describe the information in a node, a relationship between two nodes that forms an edge can be qualified using a “label”. Thus, given a second node “Thomas G,” an edge between “James R” and “Thomas G” that indicates that the two people know each other might be labeled “knows.” When graph theory notation (Graph=(Vertices, Edges)) is applied to this situation, the set of nodes is used as one parameter of the ordered pair, V, and the set of 2-element edge endpoints is used as the second parameter of the ordered pair, E. When the order of the edge endpoints within the pairs of E is not significant, for example, when the edge James R, Thomas G is equivalent to Thomas G, James R, the graph is designated as “undirected.” Under circumstances when a relationship flows from one node to another in one direction, for example James R is “taller” than Thomas G, the order of the endpoints is significant. Graphs with such edges are designated as “directed.” In the distributed computational graph system, transformations within a transformation pipeline are represented as a directed graph, with each transformation comprising a node and the output messages between transformations comprising edges. The distributed computational graph stipulates the potential use of non-linear transformation pipelines which are programmatically linearized. Such linearization can result in exponential growth of resource consumption. The most sensible approach to overcome this possibility is to introduce new transformation pipelines just as they are needed, creating only those that are ready to compute. Such a method results in transformation graphs which are highly variable in size and in node and edge composition as the system processes data streams. Those familiar with the art will realize that a transformation graph may assume many shapes and sizes with a vast topography of edge relationships. The examples given were chosen for illustrative purposes only and represent a small number of the simplest of possibilities. These examples should not be taken to define the possible graphs expected as part of operation of the invention.
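The graph definition above can be illustrated with a minimal sketch. The class name Graph and its methods are hypothetical and chosen only for illustration; they are not part of the disclosed system. The sketch stores properties on nodes and labels on edges, and treats endpoint order as significant only when the graph is directed.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Illustrative graph G = (V, E) with node properties and edge labels."""
    directed: bool = False
    nodes: dict = field(default_factory=dict)  # node name -> set of properties
    edges: dict = field(default_factory=dict)  # endpoint pair -> label

    def add_node(self, name, *properties):
        # Create the node if needed and attach any qualifying properties.
        self.nodes.setdefault(name, set()).update(properties)

    def _key(self, a, b):
        # Directed edges keep endpoint order; undirected edges ignore it.
        return (a, b) if self.directed else frozenset((a, b))

    def add_edge(self, a, b, label):
        self.add_node(a)
        self.add_node(b)
        self.edges[self._key(a, b)] = label

    def label(self, a, b):
        # Returns None when no edge exists in the requested direction.
        return self.edges.get(self._key(a, b))


# Undirected example: "knows" holds regardless of endpoint order.
social = Graph(directed=False)
social.add_node("James R", "183 cm tall", "DOB Aug. 13, 1965", "speaks English")
social.add_edge("James R", "Thomas G", "knows")

# Directed example: "taller" holds in one direction only.
height = Graph(directed=True)
height.add_edge("James R", "Thomas G", "taller")
```

In this sketch, looking up the edge from “Thomas G” to “James R” returns “knows” in the undirected graph and nothing in the directed one, which mirrors the distinction drawn in the definition.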

As used herein, “transformation” is a function performed on zero or more streams of input data which results in a single stream of output which may or may not then be used as input for another transformation. Transformations may comprise any combination of machine, human or machine-human interactions. Transformations need not change data that enters them; one example of this type of transformation would be a storage transformation, which would receive input and then act as a queue for that data for subsequent transformations. As implied above, a specific transformation may generate output data in the absence of input data. A time stamp serves as an example. In the invention, transformations are placed into pipelines such that the output of one transformation may serve as an input for another. These pipelines can consist of two or more transformations, with the number of transformations limited only by the resources of the system. Historically, transformation pipelines have been linear, with each transformation in the pipeline receiving input from one antecedent and providing output to one subsequent with no branching or iteration. Other pipeline configurations are possible. The invention is designed to permit several of these configurations including, but not limited to: linear, afferent branch, efferent branch and cyclical.
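A linear pipeline of transformations can be sketched as follows. This is only an illustration under stated assumptions: the function names (timestamp_source, storage, scale, pipeline) are hypothetical, streams are modeled as Python iterators, and only the linear configuration is shown. The sketch includes a zero-input transformation (a time stamp source) and a storage transformation that queues data without changing it.

```python
import time
from collections import deque


def timestamp_source(count):
    """Zero-input transformation: emits output with no input stream."""
    for _ in range(count):
        yield time.time()


def storage(stream):
    """Storage transformation: queues records and passes them on unchanged."""
    queue = deque(stream)
    while queue:
        yield queue.popleft()


def scale(stream, factor=2):
    """Ordinary transformation: one input stream in, one output stream out."""
    for value in stream:
        yield value * factor


def pipeline(source, *transformations):
    """Linear pipeline: each output stream feeds the next transformation."""
    stream = source
    for transformation in transformations:
        stream = transformation(stream)
    return stream


doubled = list(pipeline(iter([1, 2, 3]), storage, scale))
stamps = list(pipeline(timestamp_source(3), storage))
```

Branching and cyclical configurations would need a richer structure than this chain of iterators, such as the directed graph described in the preceding definition.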

A “database” or “data storage subsystem” (these terms may be considered substantially synonymous), as used herein, is a system adapted for the long-term storage, indexing, and retrieval of data, the retrieval typically being via some sort of querying interface or language. “Database” may be used to refer to relational database management systems, but should not be considered to be limited to such systems. Many alternative database or data storage system technologies have been, and indeed are being, introduced, including but not limited to distributed non-relational data storage systems such as Hadoop, column-oriented databases, in-memory databases, and the like. While various embodiments may preferentially employ one or another of the various data storage subsystems available (or available in the future), the invention should not be construed to be so limited, as any data storage architecture may be used according to the embodiments. Similarly, while in some cases one or more particular data storage needs are described as being satisfied by separate components (for example, an expanded private capital markets database and a configuration database), these descriptions refer to functional uses of data storage systems and do not refer to their physical architecture. For instance, any group of data storage systems or databases referred to herein may be included together in a single database management system operating on a single machine, or they may be included in a single database management system operating on a cluster of machines. Similarly, any single database (such as an expanded private capital markets database) may be implemented on a single machine, on a set of machines using clustering technology, on several machines connected by one or more messaging systems, or in a master/slave arrangement. These examples should make clear that no particular architectural approach to database management is preferred according to the invention, and choice of data storage technology is at the discretion of each implementer, without departing from the scope of the invention as claimed.

Hardware Architecture

Generally, the techniques disclosed herein may be implemented on hardware or a combination of software and hardware. For example, they may be implemented in an operating system kernel, in a separate user process, in a library package bound into network applications, on a specially constructed machine, on an application-specific integrated circuit (ASIC), or on a network interface card.

Software/hardware hybrid implementations of at least some of the embodiments disclosed herein may be implemented on a programmable network-resident machine (which should be understood to include intermittently connected network-aware machines) selectively activated or reconfigured by a computer program stored in memory. Such network devices may have multiple network interfaces that may be configured or designed to utilize different types of network communication protocols. A general architecture for some of these machines may be disclosed herein in order to illustrate one or more exemplary means by which a given unit of functionality may be implemented. According to specific embodiments, at least some of the features or functionalities of the various embodiments disclosed herein may be implemented on one or more general-purpose computers associated with one or more networks, such as for example an end-user computer system, a client computer, a network server or other server system possibly networked with others in a data processing center, a mobile computing device (e.g., tablet computing device, mobile phone, smartphone, laptop, and the like), a consumer electronic device, a music player, or any other suitable electronic device, router, switch, or the like, or any combination thereof. In at least some embodiments, at least some of the features or functionalities of the various embodiments disclosed herein may be implemented in one or more virtualized computing environments (e.g., network computing clouds, virtual machines hosted on one or more physical computing machines, or the like).

Referring now to FIG. 1, there is shown a block diagram depicting an exemplary computing device 100 suitable for implementing at least a portion of the features or functionalities disclosed herein. Computing device 100 may be, for example, any one of the computing machines listed in the previous paragraph, or indeed any other electronic device capable of executing software- or hardware-based instructions according to one or more programs stored in memory. Computing device 100 may be adapted to communicate with a plurality of other computing devices, such as clients or servers, over communications networks such as a wide area network, a metropolitan area network, a local area network, a wireless network, the Internet, or any other network, using known protocols for such communication, whether wireless or wired.

In one embodiment, computing device 100 includes one or more central processing units (CPU) 102, one or more interfaces 110, and one or more buses 106 (such as a peripheral component interconnect (PCI) bus). When acting under the control of appropriate software or firmware, CPU 102 may be responsible for implementing specific functions associated with the functions of a specifically configured computing device or machine. For example, in at least one embodiment, a computing device 100 may be configured or designed to function as a server system utilizing CPU 102, local memory 101 and/or remote memory 120, and interface(s) 110. In at least one embodiment, CPU 102 may be caused to perform one or more of the different types of functions and/or operations under the control of software modules or components, which for example, may include an operating system and any appropriate applications software, drivers, and the like.

CPU 102 may include one or more processors 103 such as, for example, a processor from one of the Intel, ARM, Qualcomm, and AMD families of microprocessors. In some embodiments, processors 103 may include specially designed hardware such as application-specific integrated circuits (ASICs), electrically erasable programmable read-only memories (EEPROMs), field-programmable gate arrays (FPGAs), and so forth, for controlling operations of computing device 100. In a specific embodiment, a local memory 101 (such as non-volatile random access memory (RAM) and/or read-only memory (ROM), including for example one or more levels of cached memory) may also form part of CPU 102. However, there are many different ways in which memory may be coupled to system 100. Memory 101 may be used for a variety of purposes such as, for example, caching and/or storing data, programming instructions, and the like.

As used herein, the term “processor” is not limited merely to those integrated circuits referred to as a processor, a mobile processor, or a microprocessor, but broadly refers to a microcontroller, a microcomputer, a programmable logic controller, an application-specific integrated circuit, and any other programmable circuit.

In one embodiment, interfaces 110 are provided as network interface cards (NICs). Generally, NICs control the sending and receiving of data packets over a computer network; other types of interfaces 110 may for example support other peripherals used with computing device 100. Among the interfaces that may be provided are Ethernet interfaces, frame relay interfaces, cable interfaces, DSL interfaces, token ring interfaces, graphics interfaces, and the like. In addition, various types of interfaces may be provided such as, for example, universal serial bus (USB), Serial, Ethernet, Firewire, PCI, parallel, radio frequency (RF), Bluetooth, near-field communications (e.g., using near-field magnetics), 802.11 (WiFi), frame relay, TCP/IP, ISDN, fast Ethernet interfaces, Gigabit Ethernet interfaces, asynchronous transfer mode (ATM) interfaces, high-speed serial interface (HSSI) interfaces, Point of Sale (POS) interfaces, fiber data distributed interfaces (FDDIs), and the like. Generally, such interfaces 110 may include ports appropriate for communication with appropriate media. In some cases, they may also include an independent processor and, in some instances, volatile and/or non-volatile memory (e.g., RAM).

Although the system shown in FIG. 1 illustrates one specific architecture for a computing device 100 for implementing one or more of the inventions described herein, it is by no means the only device architecture on which at least a portion of the features and techniques described herein may be implemented. For example, architectures having one or any number of processors 103 may be used, and such processors 103 may be present in a single device or distributed among any number of devices. In one embodiment, a single processor 103 handles communications as well as routing computations, while in other embodiments a separate dedicated communications processor may be provided. In various embodiments, different types of features or functionalities may be implemented in a system according to the invention that includes a client device (such as a tablet device or smartphone running client software) and server systems (such as a server system described in more detail below).

Regardless of network device configuration, the system of the present invention may employ one or more memories or memory modules (such as, for example, remote memory block 120 and local memory 101) configured to store data, program instructions for the general-purpose network operations, or other information relating to the functionality of the embodiments described herein (or any combinations of the above). Program instructions may control execution of or comprise an operating system and/or one or more applications, for example. Memory 120 or memories 101, 120 may also be configured to store data structures, configuration data, encryption data, historical system operations information, or any other specific or generic non-program information described herein.

Because such information and program instructions may be employed to implement one or more systems or methods described herein, at least some network device embodiments may include nontransitory machine-readable storage media, which, for example, may be configured or designed to store program instructions, state information, and the like for performing various operations described herein. Examples of such nontransitory machine-readable storage media include, but are not limited to, magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM disks; magneto-optical media such as optical disks, and hardware devices that are specially configured to store and perform program instructions, such as read-only memory devices (ROM), flash memory, solid state drives, memristor memory, random access memory (RAM), and the like. Examples of program instructions include both object code, such as may be produced by a compiler, machine code, such as may be produced by an assembler or a linker, byte code, such as may be generated by for example a Java compiler and may be executed using a Java virtual machine or equivalent, or files containing higher level code that may be executed by the computer using an interpreter (for example, scripts written in Python, Perl, Ruby, Groovy, or any other scripting language).

In some embodiments, systems according to the present invention may be implemented on a standalone computing system. Referring now to FIG. 2, there is shown a block diagram depicting a typical exemplary architecture of one or more embodiments or components thereof on a standalone computing system. Computing device 200 includes processors 210 that may run software that carries out one or more functions or applications of embodiments of the invention, such as for example a client application 230. Processors 210 may carry out computing instructions under control of an operating system 220 such as, for example, a version of Microsoft's Windows operating system, Apple's Mac OS/X or iOS operating systems, some variety of the Linux operating system, Google's Android operating system, or the like. In many cases, one or more shared services 225 may be operable in system 200, and may be useful for providing common services to client applications 230. Services 225 may for example be Windows services, user-space common services in a Linux environment, or any other type of common service architecture used with operating system 220. Input devices 270 may be of any type suitable for receiving user input, including for example a keyboard, touchscreen, microphone (for example, for voice input), mouse, touchpad, trackball, or any combination thereof. Output devices 260 may be of any type suitable for providing output to one or more users, whether remote or local to system 200, and may include for example one or more screens for visual output, speakers, printers, or any combination thereof. Memory 240 may be random-access memory having any structure and architecture, for use by processors 210, for example to run software. Storage devices 250 may be any magnetic, optical, mechanical, memristor, or electrical storage device for storage of data in digital form. Examples of storage devices 250 include flash memory, magnetic hard drive, CD-ROM, and/or the like.

In some embodiments, systems of the present invention may be implemented on a distributed computing network, such as one having any number of clients and/or servers. Referring now to FIG. 3, there is shown a block diagram depicting an exemplary architecture 300 for implementing at least a portion of a system according to an embodiment of the invention on a distributed computing network. According to the embodiment, any number of clients 330 may be provided. Each client 330 may run software for implementing client-side portions of the present invention; clients may comprise a system 200 such as that illustrated in FIG. 2. In addition, any number of servers 320 may be provided for handling requests received from one or more clients 330. Clients 330 and servers 320 may communicate with one another via one or more electronic networks 310, which may be, in various embodiments, any of the Internet, a wide area network, a mobile telephony network, a wireless network (such as WiFi, Wimax, and so forth), or a local area network (or indeed any network topology; the invention does not prefer any one network topology over any other). Networks 310 may be implemented using any known network protocols, including for example wired and/or wireless protocols.

In addition, in some embodiments, servers 320 may call external services 370 when needed to obtain additional information, or to refer to additional data concerning a particular call. Communications with external services 370 may take place, for example, via one or more networks 310. In various embodiments, external services 370 may comprise web-enabled services or functionality related to or installed on the hardware device itself. For example, in an embodiment where client applications 230 are implemented on a smartphone or other electronic device, client applications 230 may obtain information stored in a server system 320 in the cloud or on an external service 370 deployed on one or more of a particular enterprise's or user's premises.

In some embodiments of the invention, clients 330 or servers 320 (or both) may make use of one or more specialized services or appliances that may be deployed locally or remotely across one or more networks 310. For example, one or more databases 340 may be used or referred to by one or more embodiments of the invention. It should be understood that databases 340 may be arranged in a wide variety of architectures and using a wide variety of data access and manipulation means. For example, in various embodiments one or more databases 340 may comprise a relational database system using a structured query language (SQL), while others may comprise an alternative data storage technology such as “NoSQL” (for example, Hadoop, MapReduce, BigTable, and so forth). In some embodiments variant database architectures such as column-oriented databases, in-memory databases, clustered databases, distributed databases, key-value stores, or even flat file data repositories may be used according to the invention. It will be appreciated that any combination of database technologies may be used as appropriate, unless a specific database technology or a specific arrangement of components is specified for a particular embodiment herein. Moreover, it should be appreciated that the term “database” as used herein may refer to a physical database machine, a cluster of machines acting as a single database system, or a logical database within an overall database management system. Unless a specific meaning is specified for a given use of the term “database”, it should be construed to mean any of these senses of the word, all of which are understood as a plain meaning of the term “database”.

Similarly, most embodiments of the invention may make use of one or more security systems 360 and configuration systems 350. Security and configuration management are common information technology (IT) and web functions, and some amount of each is generally associated with any IT or web system. It should be understood that any configuration or security subsystems may be used in conjunction with embodiments of the invention without limitation, unless a specific security 360 or configuration 350 system or approach is specifically required by the description of any specific embodiment.

FIG. 4 shows an exemplary overview of a computer system 400 as may be used in any of the various locations throughout the system. It is exemplary of any computer that may execute code to process data. Various modifications and changes may be made to computer system 400 without departing from the broader scope of the system and method disclosed herein. CPU 401 is connected to bus 402, to which bus is also connected memory 403, nonvolatile memory 404, display 407, I/O unit 408, and network interface card (NIC) 413. I/O unit 408 may, typically, be connected to keyboard 409, pointing device 410, hard disk 412, and real-time clock 411. NIC 413 connects to network 414, which may be the Internet or a local network, which local network may or may not have connections to the Internet. Also shown as part of system 400 is power supply unit 405 connected, in this example, to AC supply 406. Not shown are batteries that could be present, and many other devices and modifications that are well known but are not applicable to the specific novel functions of the current system and method disclosed herein. It should be appreciated that some or all components illustrated may be combined, such as in various integrated applications (for example, Qualcomm or Samsung SOC-based devices), or whenever it may be appropriate to combine multiple capabilities or functions into a single hardware device (for instance, in mobile devices such as smartphones, video game consoles, in-vehicle computer systems such as navigation or multimedia systems in automobiles, or other integrated hardware devices).

In various embodiments, functionality for implementing systems or methods of the present invention may be distributed among any number of client and/or server components. For example, various software modules may be implemented for performing various functions in connection with the present invention, and such modules may be variously implemented to run on server and/or client components.

Conceptual Architecture

FIG. 23 is a block diagram illustrating an exemplary system architecture for automated planning, according to a preferred embodiment. In various embodiments, an automated planning (AP) system 2300 is capable of handling queue-based jobs individually or in batches as multiple-run jobs, with individual worker nodes handling tasks and providing completed work to a master node that evaluates and publishes results, for example using an AKKA™ cluster or similar master/worker cluster operation. This achieves an asynchronous, “eventual agreement” data model wherein worker nodes need not synchronize with each other directly, and completed work is reconciled at the master node; this data model allows asynchronous task completion and eliminates the need for additional time and data throughput to be allocated to synchronization tasks, enabling worker nodes to be dedicated solely to completing an assigned work task. Master and worker nodes may be distributed across a network, and may operate as a federated service that may be accessible to clients from any location, providing a cloud-based AP service.

An automated planning (AP) service 2320 connects to a plurality of databases 2310 a-n, such as an MDTSD for storing time-series data or various query-specific datastores such as (for example, including but not limited to) a SQL or GraphStack storage, or other data storage suitable for storing and providing data to various components of the system as needed. Data may be provided to AP service 2320 via a REST API 2311 to enable initiation, processing, and retrieval of data by AP service 2320. AP service 2320 in turn comprises a plurality of worker nodes 2321 a-n that are managed by one of a plurality of master nodes 2322 a-n, which receives data from databases 2310 a-n as well as via REST API 2323 from external inputs such as (for example) a user interacting via a command-line interface (CLI) 2330.

Master nodes 2322 a-n provide work tasks to worker nodes 2321 a-n, and worker nodes 2321 a-n operate only to complete a given work task. When not actively processing a given task, a worker node reports to its corresponding master node that it is ready for work, and when a processing task completes the worker node publishes the results to its master node along with a report that it is again ready for new work. In this manner, each individual worker node operates independently to focus on a delegated work task, while master nodes delegate work and collect completed items. This facilitates the “eventual agreement” processing model, as the worker nodes do nothing to synchronize with each other (and may not have any sense of “awareness” of any other worker nodes, interacting exclusively with the master node) while the master node reconciles completed work items as they are received, piecing together individual tasks as a given work job is completed piece-by-piece by the worker nodes. Exemplary processes for performing automated planning using this worker/master architecture are detailed below, with reference to FIGS. 24-26.
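The master/worker interaction described above can be sketched in a single process using Python threads and queues as stand-ins for distributed nodes. This is an assumption-laden illustration, not the disclosed implementation: the names worker and run_job are hypothetical, squaring a number stands in for a real work task, and a production deployment would use a cluster framework rather than threads. Workers never communicate with each other; the master alone reconciles results, which may arrive in any order.

```python
import queue
import threading


def worker(tasks, results):
    """Worker: takes a task when idle, publishes the result, then repeats."""
    while True:
        item = tasks.get()  # blocking here plays the role of "ready for work"
        if item is None:    # sentinel from the master: no more work
            break
        index, payload = item
        results.put((index, payload * payload))  # publish to the master only


def run_job(payloads, n_workers=3):
    """Master: delegates tasks, then reconciles results as they arrive."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for thread in threads:
        thread.start()
    for index, payload in enumerate(payloads):
        tasks.put((index, payload))
    for _ in threads:
        tasks.put(None)  # one stop sentinel per worker

    reconciled = {}
    for _ in range(len(payloads)):
        index, value = results.get()  # arrival order is not guaranteed
        reconciled[index] = value     # piece the job together by task index
    for thread in threads:
        thread.join()
    return [reconciled[i] for i in range(len(payloads))]


plan_pieces = run_job([1, 2, 3, 4])
```

Because each result carries its task index, the master can assemble the completed job without any worker-to-worker synchronization, which is the property the “eventual agreement” model relies on.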

FIG. 5 is a block diagram of an exemplary architecture for a system 500 for predictive analysis of very large data sets using a distributed computational graph (DCG). According to the embodiment, streaming input feeds 510 may be a variety of data sources which may include but are not limited to the internet 511, arrays of physical sensors 512, database servers 513, electronic monitoring equipment 514 and direct human interaction 515 ranging from a relatively few numbers of participants to a large crowd sourcing campaign. Streaming data from any combinations of listed sources and those not listed may also be expected to occur as part of the operation of the invention as the number of streaming input sources is not limited by the design. All incoming streaming data may be passed through a data filter software module 520 to remove information that has been damaged in transit, is misconfigured, or is malformed in some way that precludes use. Many of the filter parameters may be expected to be preset prior to operation; however, design of the invention makes provision for the behavior of the filter software module 520 to be changed as progression of analysis requires through the automation of the system sanity and retrain software module 563, which may serve to optimize system operation and analysis function. The data stream may also be split into two identical sub streams at the data filter software module 520, with one sub stream being fed into a streaming analysis pathway that includes the transformation pipeline software module 561 of the distributed computational graph 560. The other sub stream may be fed to data formalization software module 530 as part of the batch analysis pathway. The data formalization module 530 formats the data stream entering the batch analysis pathway of the invention into data records to be stored by the input event data store 540. The input event data store 540 can be a database of any architectural type, but based upon the data model the data store module would be expected to store and retrieve, options using highly distributed storage and map reduce query protocols, of which Hadoop is one, but not the only example, may be generally preferable to relational database schema.
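The filter-then-split step can be sketched as follows. The record layout (dictionaries with “timestamp” and “value” keys) and the function names is_usable, filter_and_split and formalize are assumptions made for illustration only; the source does not specify a record format or filter criteria. The sketch drops unusable records, duplicates the remaining stream into two identical sub streams, and formalizes one of them into records suitable for storage.

```python
import itertools


def is_usable(record):
    """Assumed filter rule: keep only dicts carrying both expected fields."""
    return isinstance(record, dict) and "timestamp" in record and "value" in record


def filter_and_split(records):
    """Drop unusable records, then split into two identical sub streams."""
    clean = (record for record in records if is_usable(record))
    streaming, batch = itertools.tee(clean, 2)
    return streaming, batch


def formalize(record):
    """Batch pathway: format a record as a (timestamp, value) row for storage."""
    return (record["timestamp"], record["value"])


raw = [
    {"timestamp": 1, "value": 10},
    None,            # damaged in transit
    {"value": 3},    # malformed: missing timestamp
    {"timestamp": 2, "value": 20},
]
streaming, batch = filter_and_split(raw)
streaming_records = list(streaming)
batch_rows = [formalize(record) for record in batch]
```

Here itertools.tee yields two independent iterators over the same filtered data, so the streaming and batch pathways each see every usable record exactly once.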

Analysis of data from the input event data store may be performed by the batch event analysis software module 550. This module may be used to analyze the data in the input event data store for temporal information such as trends, previous occurrences of the progression of a set of events, with outcome, the occurrence of a single specific event with all events recorded before and after whether deemed relevant at the time or not, and presence of a particular event with all documented possible causative and remedial elements, including best guess probability information. It should be recognized that while examples here focus on having stores of information pertaining to time, the use of the invention is not limited to such contexts as there are other fields where having a store of existing data would be critical to predictive analysis of streaming data 561. The search parameters used by the batch event analysis software module 550 are preset by those conducting the analysis at the beginning of the process; however, as the search matures and results are gleaned from the streaming data during transformation pipeline software module 561 operation, providing the system more timely event progress details, the system sanity and retrain software module 563 may automatically update the batch analysis parameters 550. Alternately, findings outside the system may prompt the authors of the analysis to tune the batch analysis parameters administratively from outside the system 570, 562, 563. The real-time data analysis core 560 of the invention should be considered made up of a transformation pipeline software module 561, messaging module 562 and system sanity and retrain software module 563. The messaging module 562 has connections from both the batch and the streaming data analysis pathways and serves as a conduit for operational as well as result information between those two parts of the invention. The messaging module also receives messages from those administering analyses 580. Messages aggregated by the messaging module 562 may then be sent to system sanity and retrain software module 563 as appropriate. Several of the functions of the system sanity and retrain software module have already been disclosed. Briefly, this is software that may be used to monitor the progress of streaming data analysis, optimizing coordination between streaming and batch analysis pathways by modifying or “retraining” the operation of the data filter software module 520, data formalization software module 530, batch event analysis software module 550 and the transformation pipeline module 561 of the streaming pathway when the specifics of the search may change due to results produced during streaming analysis. System sanity and retrain module 563 may also monitor for data searches or transformations that are processing slowly or may have hung and for results that are outside established data stability boundaries so that actions can be implemented to resolve the issue. While the system sanity and retrain software module 563 may be designed to act autonomously and employs computer learning algorithms, according to some arrangements administrators may make status updates or potentially direct changes to operational parameters, according to the embodiment.

Streaming data entering from the outside data feeds 510 through the data filter software module 520 may be analyzed in real time within the transformation pipeline software module 561. Within a transformation pipeline, a set of functions tailored to the analysis being run is applied to the input data stream. According to the embodiment, functions may be applied in a linear, directed path or in more complex configurations. Functions may be modified over time during an analysis by the system sanity and retrain software module 563, and the results of the transformation pipeline, impacted by the results of batch analysis, are then output in the format stipulated by the authors of the analysis, which may be a human readable printout, an alarm, machine readable information destined for another system or any of a plurality of other forms.

FIG. 6 is a block diagram of a preferred architecture for a transformation pipeline within a system for predictive analysis of very large data sets using distributed computational graph 600. According to the embodiment, streaming input from the data filter software module 520, 615 serves as input to the first transformation node 620 of the transformation pipeline. The transformation node's function is performed on the input data stream and the transformed output message 625 is sent to transformation node 2 630. The progression of transformation nodes 620, 630, 640, 650, 660 and associated output messages from each node 625, 635, 645, 655, 665 is linear in configuration; this is the simplest arrangement and, as previously noted, represents the current state of the art. While transformation nodes are described according to various embodiments as uniform shape (referring to FIGS. 6-9), such uniformity is used for presentation simplicity and clarity and does not reflect necessary operational similarity between transformations within the pipeline. It should be appreciated that one knowledgeable in the field will realize that certain transformations in a pipeline may be entirely self-contained; certain transformations may involve direct human interaction 630, such as selection via dial or dials, positioning of switch or switches, or parameters set on control display, all of which may change during analysis; other transformations may require external aggregation or correlation services or may rely on remote procedure calls to synchronous or asynchronous analysis engines as might occur in simulations, among a plurality of other possibilities. Further according to the embodiment, individual transformation nodes in one pipeline may represent the function of another transformation pipeline. It should be appreciated that the node length of transformation pipelines depicted in no way confines the transformation pipelines employed by the invention to an arbitrary maximum length 640, 650, 660 as, being distributed, the number of transformations would be limited by the resources made available to each implementation of the invention. It should be further appreciated that there need be no limits on transform pipeline length. Output of the last transformation node and, by extension, the transform pipeline 660 may be sent back to messaging software module 562 for predetermined action.
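
As a minimal sketch of the linear arrangement described above, each transformation node may be modeled as a function from an input stream to an output stream, with the pipeline formed by chaining nodes in order. The node functions (double, keep_large, label) and the helper as_node, which wraps a whole pipeline so that it can stand in as a single node of another pipeline, are hypothetical names chosen for illustration.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator, List

Node = Callable[[Iterator], Iterator]


def double(stream: Iterator[int]) -> Iterator[int]:
    for value in stream:
        yield value * 2


def keep_large(stream: Iterator[int]) -> Iterator[int]:
    for value in stream:
        if value > 4:
            yield value


def label(stream: Iterator[int]) -> Iterator[str]:
    for value in stream:
        yield f"value={value}"


def run_pipeline(source: Iterable, nodes: List[Node]) -> list:
    """Feed the source through each node in order and collect the final output."""
    stream = reduce(lambda s, node: node(s), nodes, iter(source))
    return list(stream)


def as_node(nodes: List[Node]) -> Node:
    """Wrap a whole pipeline so it can serve as one node of another pipeline."""
    return lambda stream: reduce(lambda s, node: node(s), nodes, stream)


result = run_pipeline([1, 2, 3, 4], [double, keep_large, label])
nested = run_pipeline([1, 2, 3, 4], [as_node([double, keep_large]), label])
```

Because the nodes are generators, each message flows through the chain as it is produced, and the number of nodes is bounded only by available resources, consistent with the description above.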

FIG. 7 is a block diagram of another preferred architecture for a transformation pipeline within a system for predictive analysis of very large data sets using distributed computational graph 700. According to the embodiment, streaming input from a data filter software module 520, 705 serves as input to the first transformation node 710 of the transformation pipeline. The transformation node's function is performed on the input data stream and the transformed output message 715 is sent to transformation node 2 720. In this embodiment, transformation node 2 720 has a second input stream 765. The specific source of this input is inconsequential to the operation of the invention and could be another transformation pipeline software module, a data store, human interaction, physical sensors, monitoring equipment for other electronic systems or a stream from the internet as from a crowdsourcing campaign, just to name a few possibilities 760. In an alternative embodiment, a second input stream 760 may contain a specification of data context that is preserved from the first stream into node 2 720, the shared data context between the inputs of a transformation node 720 allowing the services or streams that send data to a node to share common meaning and enable faster or different methods of processing, including finding correlations or causative tendencies between data from two sources or streams. It is not required that a secondary, tertiary, or further source of data 760 function as input to specifically the second node in the graph 720, and there may be a plurality of other datastreams feeding into one or several different nodes in the graph. Functional integration of a second input stream into one transformation node requires that the two input stream events be serialized. The invention performs this serialization using a decomposable transformation software module (not shown), the function of which is described below, referring to FIG. 13. While transformation nodes are described according to various embodiments as uniform shape (referring to FIGS. 6-9), such uniformity is used for presentation simplicity and clarity and does not reflect necessary operational similarity between transformations within the pipeline. It should be appreciated that one knowledgeable in the field will realize that certain transformations in a pipeline may be entirely self-contained; certain transformations may involve direct human interaction 630, such as selection via dial or dials, positioning of switch or switches, or parameters set on control display, all of which may change during analysis; other transformations may require external aggregation or correlation services or may rely on remote procedure calls to synchronous or asynchronous analysis engines as might occur in simulations, among a plurality of other possibilities. Further according to the embodiment, individual transformation nodes in one pipeline may represent the function of another transformation pipeline. It should be appreciated that the node length of transformation pipelines depicted in no way confines the transformation pipelines employed by the invention to an arbitrary maximum length 710, 720, 730, 740, 750, as, being distributed, the number of transformations and their outputs 715, 725, 735, 745 would be limited by the resources made available to each implementation of the invention. It should be further appreciated that there need be no limits on transform pipeline length. Output 755 of the last transformation node and, by extension, the transform pipeline 750 may be sent back to messaging software module 562 for pre-decided action.
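
One simple way to picture the serialization of two input streams feeding a single transformation node is a timestamp-ordered merge. The sketch below is not the decomposable transformation software module referred to above; it substitutes an ordinary sorted merge, and assumes, purely for illustration, that each event is a tuple of (timestamp, source, payload) and that each input stream is already ordered by timestamp.

```python
import heapq
from typing import List, Tuple

Event = Tuple[int, str, str]  # (timestamp, source name, payload); assumed layout

primary: List[Event] = [(1, "primary", "a"), (4, "primary", "b")]
secondary: List[Event] = [(2, "secondary", "x"), (3, "secondary", "y")]

# heapq.merge lazily interleaves already-sorted inputs into one ordered sequence,
# so the receiving node sees a single serialized stream of events.
serialized: List[Event] = list(heapq.merge(primary, secondary, key=lambda e: e[0]))


def node_2(events: List[Event]) -> List[str]:
    """Hypothetical node that consumes the serialized stream."""
    return [f"{ts}:{source}:{payload}" for ts, source, payload in events]


combined_output = node_2(serialized)
```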

FIG. 8 is a block diagram of another preferred architecture for a transformation pipeline within a system for predictive analysis of very large data sets using distributed computational graph 700. According to the embodiment, streaming input from a data filter software module 520, 805 serves as input to the first transformation node 810 of the transformation pipeline. The transformation node's function is performed on the input data stream and the transformed output message 815 is sent to transformation node 2 820. In this embodiment, transformation node 2 820 sends its output stream 825, 860 to two transformation pipelines 830, 840, 850; 865, 875. This allows the same data stream to undergo two disparate, possibly completely unrelated, analyses without having to duplicate the infrastructure of the initial transform manipulations, greatly increasing the expressivity of the invention over current transform pipelines and facilitating greater efficiency, as workloads can be distributed across the available infrastructure without manual specification from an end user. Functional integration of a second output stream from one transformation node 820 requires that the two output stream events be serialized. The invention performs this serialization using a decomposable transformation software module (not shown), the function of which is described below, referring to FIG. 14. While transformation nodes are described according to various embodiments as uniform shape (referring to FIGS. 6-9), such uniformity is used for presentation simplicity and clarity and does not reflect necessary operational similarity between transformations within the pipeline. It should be appreciated that one knowledgeable in the field will realize that certain transformations in pipelines may be entirely self-contained; certain transformations may involve direct human interaction 630, such as selection via dial or dials, positioning of switch or switches, or parameters set on control display, all of which may change during analysis; other transformations may require external aggregation or correlation services or may rely on remote procedure calls to synchronous or asynchronous analysis engines as might occur in simulations, among a plurality of other possibilities.

Further according to the embodiment, individual transformation nodes in one pipeline may represent the function of another transformation pipeline. It should be appreciated that the number of nodes in the transformation pipelines, and their outputs 815, 825, 835, 845, 855, 860, 870, 880, as depicted in no way confines the transformation pipelines employed by the invention to an arbitrary maximum length 810, 820, 830, 840, 850; 865, 875 as, being distributed, the number of transformations would be limited by the resources made available to each implementation of the invention. Further according to the embodiment, there need be no limits on transform pipeline length. Output of the last transformation node and, by extension, the transform pipeline 850 may be sent back to messaging software module 562 for contemporary enabled action.
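
The branching arrangement of FIG. 8 may be illustrated, under stated assumptions, by computing the shared upstream transformations once and handing the result to two unrelated downstream analyses. The node and branch functions below are hypothetical, and itertools.tee is used only as a convenient stand-in for distributing one output stream to two consumers.

```python
from itertools import tee
from typing import Iterator


def node_1(stream: Iterator[int]) -> Iterator[int]:
    return (value + 1 for value in stream)


def node_2(stream: Iterator[int]) -> Iterator[int]:
    return (value * 2 for value in stream)


def branch_total(stream: Iterator[int]) -> int:
    """First downstream analysis: sum of all values."""
    return sum(stream)


def branch_peak(stream: Iterator[int]) -> int:
    """Second, unrelated downstream analysis: largest value."""
    return max(stream)


# The shared prefix (node 1 then node 2) is evaluated only once.
shared = node_2(node_1(iter([1, 2, 3])))
left, right = tee(shared, 2)

total = branch_total(left)
peak = branch_peak(right)
```

Note that tee buffers items consumed by one branch until the other branch reads them, so this sketch suits small examples; a distributed implementation would typically publish the output to a messaging layer instead.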

FIG. 9 is a block diagram of another preferred architecture for a transformation pipeline within a system for predictive analysis of very large data sets using distributed computational graph 700. According to the embodiment, streaming input from a data filter software module 520, 905 serves as input to the first transformation node 910 of the transformation pipeline. The transformation node's function may be performed on an input data stream and the transformed output message 915 may then be sent to transformation node 2 920. Likewise, once the data stream is acted upon by transformation node 2 920, its output is sent to transformation node 3 930 using its output message 925. In this embodiment, transformation node 3 930 sends its output stream back to transformation node 1 935, 910, forming a cyclical relationship between transformation node 1 910, transformation node 2 920 and transformation node 3 930. Upon the achievement of some gateway result, the output of cyclical pipeline activity may be sent to downstream transformation nodes within the pipeline 940, 945. The presence of a generalized cyclical pathway construct allows the invention to be used to solve complex iterative problems with large data sets involved, expanding the ability to rapidly retrieve conclusions for complicated issues. Functional creation of a cyclical transformation pipeline requires that each cycle be serialized. The invention performs this serialization using a decomposable transformation software module (not shown), the function of which is described below, referring to FIG. 15. While transformation nodes are described according to various embodiments as uniform shape (referring to FIGS. 6-9), such uniformity is used for presentation simplicity and clarity and does not reflect necessary operational similarity between transformations within the pipeline. It should be appreciated that one knowledgeable in the field will appreciate that certain transformations in pipelines may be entirely self-contained; certain transformations may involve direct human interaction 630, such as selection via dial or dials, positioning of switch or switches, or parameters set on control display, all of which may change during analysis; still other transformations may require external aggregation or correlation services or may rely on remote procedure calls to synchronous or asynchronous analysis engines as might occur in simulations, among a plurality of other possibilities. Further according to the embodiment, individual transformation nodes in one pipeline may represent the cumulative function of another transformation pipeline. It should be appreciated that the number of nodes in the transformation pipelines depicted in no way confines the transformation pipelines employed by the invention to an arbitrary maximum length 910, 920, 930, 940, 950; 965, 975 as, being distributed, the number of transformations would be limited by the resources made available to each implementation of the invention. It should be further appreciated that there need be no limits on transform pipeline length. Output of the last transformation node 960 and, by extension, the transform pipeline 955 may be sent back to messaging software module 562 for concomitant enabled action.
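
The cyclical arrangement of FIG. 9 may be sketched as three nodes applied repeatedly until a gateway result is reached, after which the value is released downstream. The particular node functions, the convergence test used as the gateway condition, and the max_cycles guard are all assumptions introduced for this illustration.

```python
from typing import Tuple


def node_1(value: float) -> float:
    return value / 2.0


def node_2(value: float) -> float:
    return value + 1.0


def node_3(value: float) -> float:
    return max(value, 0.0)


def run_cycle(value: float, tolerance: float = 1e-6, max_cycles: int = 100) -> Tuple[float, int]:
    """Loop node 1 -> node 2 -> node 3 -> node 1 until successive results agree."""
    for cycle in range(1, max_cycles + 1):
        updated = node_3(node_2(node_1(value)))
        if abs(updated - value) < tolerance:  # assumed gateway condition
            return updated, cycle
        value = updated
    raise RuntimeError("gateway result not reached within max_cycles")


result, cycles = run_cycle(10.0)

# Hypothetical downstream node that receives the output once the gateway is passed.
downstream = round(result, 3)
```

Here the cycle computes the fixed point of x/2 + 1, which is 2; the guard on the number of cycles reflects the need, noted above, to detect transformations that may have hung.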

FIG. 17 is a diagram of a computing architecture for a processing system according to one aspect of the present invention. An environmental orchestration and data processing engine 1700 permits domain experts to directly capture their knowledge via a user interface with domain agnostic building blocks. These modular components can be built and extended by programmers to satisfy a number of use cases without a need to understand how they will be used in a specific implementation. An Environmental Orchestration component 1711 and Data Processing component 1712, coupled together 1701, allow for both flexibility and tight coupling between all the actions needed to set up resources and perform analytical tasks.

The processing tasks are divided between data processing, orchestration, and system tasks. The data processing tasks provide a plug and play style data processing backend and orchestrate work against that backend. In a preferred embodiment, a data management backend 1702 provides the backend processing functionality that consumes data streams for processing. A variety of data management and stream processing backends may be utilized, including APACHE FLINK™, SPARK™, and APACHE BEAM™.

Data streams may use JavaScript Object Notation (JSON) as a lightweight data-interchange format that is easy for humans to read and write, as well as for machines to parse and generate. It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition, December 1999. JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C-family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others.
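
For illustration only, a single stream event might be encoded to and decoded from JSON as follows. The field names (sensor_id, ts, reading) are hypothetical and are not prescribed by the disclosure.

```python
import json

# Hypothetical event; the field names are illustrative assumptions.
event = {
    "sensor_id": "s-12",
    "ts": "2020-10-01T12:00:00Z",
    "reading": {"temperature_c": 22.2},
}

encoded = json.dumps(event, sort_keys=True)  # text form placed on the data stream
decoded = json.loads(encoded)                # form recovered by a consuming stage
```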

For example, a data management backend 1702 is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. A data management backend 1702 has been designed to run in all common cluster environments and to perform computations at in-memory speed and at any scale. A data management backend's architecture may process both unbounded and bounded data. Any kind of data is produced as a stream of events. Credit card transactions, sensor measurements, machine logs, or user interactions on a website or mobile application are all generated as a stream. Data can be processed as unbounded or bounded streams.

Unbounded streams have a start but no defined end. They do not terminate and provide data as it is generated. Unbounded streams must be continuously processed, i.e., events must be promptly handled after they have been ingested. It is not possible to wait for all input data to arrive because the input is unbounded and will not be complete at any point in time. Processing unbounded data often requires that events are ingested in a specific order, such as the order in which events occurred, to be able to reason about result completeness.

Bounded streams have a defined start and end. Bounded streams can be processed by ingesting all data before performing any computations. Ordered ingestion is not required to process bounded streams because a bounded data set can always be sorted. Processing of bounded streams is also known as batch processing. Pipelines and stages herein may utilize both types of data.
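
The distinction between the two stream types can be seen in a short sketch. An unbounded stream must be handled incrementally, here with a running mean that emits a result after every event, whereas a bounded stream can be fully ingested and sorted before any computation. The function running_mean and the sample data are assumptions made for illustration.

```python
from itertools import count, islice
from typing import Iterable, Iterator


def running_mean(stream: Iterable[float]) -> Iterator[float]:
    """Emit an updated mean after each event, without waiting for the stream to end."""
    total = 0.0
    for n, value in enumerate(stream, start=1):
        total += value
        yield total / n


# Unbounded: count(1) never terminates, so only a prefix of results can be observed.
unbounded = count(1)
first_means = list(islice(running_mean(unbounded), 4))

# Bounded: all events are available, so they can be sorted before processing.
bounded = [(3, "c"), (1, "a"), (2, "b")]  # (timestamp, payload), arriving out of order
ordered = sorted(bounded)
```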

Orchestration tasks directly handle serializing the Pipeline and Stages, monitoring of active Pipelines and submission of new Pipelines, as well as making requests to 3rd parties for resources to be allocated as needed. These resources may be provided within a single system, a collection of interconnected processing systems operating together within a data center, and cloud based resources provided by parties over the internet such as Amazon Web Services and Microsoft Azure. All similar cloud computing services may be used to provide all or part of a pipeline's stages as needed, with data being transferred by addressing the particular resource by its IP address.

System tasks include monitoring, metadata and recovery tasks to provide hooks between a pipeline and the controlling system 1703 itself to enable it to monitor, pull metadata about multiple pipelines running in sequence and facilitate recovery when pipelines or services fail. These tasks are needed because the controlling system 1703 does not possess a direct feed into the data as it is being processed.

While APACHE FLINK™ is one of many streaming data processing engines, it should be recognized that APIs used to construct the states typically provide functionality that is extensible enough to utilize other processing engines of streaming data such as SPARK™, APACHE BEAM™, and similar stream data processing engines. Additionally, data sinks and data sources may occur any place in the directed graph. Each data sink and data source may be specified by a declarative formalism embodiment within a workflow such that an entire orchestration workflow may be expressed within the overall workflow.

This architecture permits the various stages in a workflow to be modularly constructed, with each stage separately implemented using a declarative definition of a streaming analytics processing workflow. As long as a stage consumes and produces a data stream in a common format, any implementation of a particular stage may be used.

FIG. 18 is a diagram of a computing pipeline architecture for a processing system according to one aspect of the present invention. The unit by which this is measured is a Pipeline 1800. A pipeline 1800 represents a use case and is the high level application. Different pipelines, as well as components within a pipeline, can work in tandem, allowing for even larger logical applications to be made. Pipelines are constructed of more primitive types called stages. For example, a pipeline shown in FIG. 18 illustrates sequences of stages running in parallel. All incoming state is obtained by a source stage, stage 1 1801. This data is provided to three separate sequences of stages: stages 3-4 1811-1812, stage 2 1802, and stages 5-7 1821-1823. Each of these stages may be processing and sink stages, as three sets of results are generated.

A pipeline 1800 is defined as a computing structure for housing all the Stages used to construct the pipeline, where the pipeline of stages is represented as a DG (Directed Graph). A pipeline has three basic states: running, suspended, and deleted. The difference between suspended and deleted is that the suspended state stops processing but doesn't trigger the post conditions, while deleted stops the processing and triggers the post conditions. Pipelines are composed of four types of Stages.

Pipeline 1800 may also be constructed using cyclic workflows of stages 3-4 1811-1812 and stages 15-16 1831-1832. These cyclic workflows may be created using the same messaging fabric in a source/sink used to define all other workflows. This arrangement makes the expressive capability of this streaming analytics engine a full directed graph rather than merely the directed acyclic graphs of competing formalisms. One possible example of a cycle would be to have a source stage that consumes from a Kafka topic while a separate sink stage passes messages to the same Kafka topic, thus creating a cycle.

An alternative arrangement and use of the workflow stages is to functionally decompose the workflow stages and allow them to be embedded in other workflows as single stages, for instance having a workflow with steps A, B, C, and D, and embedding another 3-step workflow with steps E, F, and G in between steps B and C, such that the first workflow of processing data is now composed of steps A, B, (E, F, G), C, and D. This modularity and functional decomposition of data workflows constitutes a possible alternative arrangement of the disclosed system, but is not limiting or the only alternative arrangement that may be possible.
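
The embedding of one workflow within another as a single stage may be sketched as follows, using the step names A through G from the example above. Each step here merely appends its own name to a trace string so that the order of execution is visible; the helper names make_step, as_single_stage and run_workflow are hypothetical.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[str], str]]  # (stage name, function applied to the data)


def make_step(name: str) -> Step:
    """Create a step that records its name on the trace it receives."""
    return (name, lambda trace: trace + name)


def run_workflow(workflow: List[Step], trace: str = "") -> str:
    for _, fn in workflow:
        trace = fn(trace)
    return trace


def as_single_stage(name: str, workflow: List[Step]) -> Step:
    """Wrap an entire workflow so the enclosing workflow sees it as one stage."""
    return (name, lambda trace: run_workflow(workflow, trace))


outer = [make_step(n) for n in "ABCD"]
inner = [make_step(n) for n in "EFG"]

# Embed the 3-step workflow between steps B and C of the outer workflow.
combined = outer[:2] + [as_single_stage("EFG", inner)] + outer[2:]

stage_names = [name for name, _ in combined]
execution_order = run_workflow(combined)
```

The enclosing workflow holds five stages, yet the data passes through seven steps, which is the sense in which the embedded workflow behaves as a single modular stage.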

Environmental Conditions correspond to the parameters and processing conditions that a stage needs to exist in order to run on processing components. This also includes the reverse process. These are known as the Setup and Teardown Phases. These Conditions are defined by the Stage itself. Environment Stages are a specialized type of stage that contains only these post-conditions and pre-conditions.

Stages perform a simple processing task before the processing of a particular set of data is passed to another stage to perform a next step in the process. This architecture provides separate units of work that may be arranged conceptually for users of this system. This architecture also provides a mechanism for a level of abstraction for the operations performed by every stage, such as health metrics and alerting. Stages come in three basic flavors: source stages, transformation stages, and sink stages. A source stage controls how a pipeline gets its data, including its source location, format, and similar conditions. A transformation stage performs operations to manipulate the data received by the pipeline from a source stage. A sink stage controls where any resulting data is stored following its processing through a pipeline, which also includes its location, format, and similar conditions. Additionally, environment stages may also be part of a pipeline. These stages define and manipulate operating conditions defined above as environmental conditions.
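
A minimal sketch of the three basic stage flavors is given below, assuming in-memory data in place of real source and sink locations. The class and method names (SourceStage.read, TransformationStage.apply, SinkStage.write) are illustrative assumptions and do not correspond to the API of any particular processing engine.

```python
from typing import Any, Callable, Iterable, Iterator, List


class SourceStage:
    """Controls how the pipeline gets its data (here, an in-memory iterable)."""

    def __init__(self, records: Iterable[Any]) -> None:
        self._records = records

    def read(self) -> Iterator[Any]:
        return iter(self._records)


class TransformationStage:
    """Applies one operation to every item passing through."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self._fn = fn

    def apply(self, stream: Iterator[Any]) -> Iterator[Any]:
        return (self._fn(item) for item in stream)


class SinkStage:
    """Controls where results are stored (here, an in-memory list)."""

    def __init__(self) -> None:
        self.stored: List[Any] = []

    def write(self, stream: Iterator[Any]) -> None:
        self.stored.extend(stream)


def run_pipeline(source: SourceStage, transforms: List[TransformationStage], sink: SinkStage) -> List[Any]:
    stream = source.read()
    for stage in transforms:
        stream = stage.apply(stream)
    sink.write(stream)
    return sink.stored


sink = SinkStage()
stored = run_pipeline(
    SourceStage([" alpha ", "beta"]),
    [TransformationStage(str.strip), TransformationStage(str.upper)],
    sink,
)
```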

A stage, such as stage 6 1822, within pipeline 1800 may itself be constructed using a workflow defined in exactly the same way. Data enters stage 15 1831 as a data stream and exits as a data stream, with the intervening processing steps implemented as a separately defined workflow pipeline used as a modular stage element. Downstream stages, such as stage 7 1823, do not know whether the data they receive is from a self-contained implementation of stage 6 1822, or from an embedded workflow such as from stage 17 1833. Hierarchical arrangement of workflows in such a manner permits construction of a complex workflow from a combination of less complex workflows. All of this configuration of workflows may be defined in the declarative form described herein, and may use stages implemented in different backend processing engines such as Flink, Spark, Beam or similar data streaming processing technologies.

In order to support such modular functionality, workflow pipeline 1800 utilizes a common data context permitting easy data exchange and integration of stages implemented in the various processing engines without complication. As noted above, use of a common data exchange format, such as JSON, will assist this modularity. Also, data may be specified using a common set of terms to permit ease of interoperability. A simple example would be to transform all incoming data streams into a standard set of values. For example, data such as distance, temperature, and time (zone) may be provided in various units. By transforming the data into a common set of units, all workflows may interoperate without issue. Data may be retransformed into a set of units useful to a user once the processing is otherwise completed.
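
The transformation of incoming data into a common set of units may be sketched as a lookup of conversion functions keyed by quantity and unit. The choice of degrees Celsius and meters as the common units, the record layout, and the unit abbreviations are assumptions made for this example.

```python
from typing import Callable, Dict, Tuple

# Assumed common units for this sketch: degrees Celsius and meters.
CANONICAL_UNIT: Dict[str, str] = {"temperature": "C", "distance": "m"}

TO_CANONICAL: Dict[Tuple[str, str], Callable[[float], float]] = {
    ("temperature", "C"): lambda v: v,
    ("temperature", "F"): lambda v: (v - 32.0) * 5.0 / 9.0,
    ("distance", "m"): lambda v: v,
    ("distance", "km"): lambda v: v * 1000.0,
    ("distance", "mi"): lambda v: v * 1609.344,  # international mile
}


def normalize(record: dict) -> dict:
    """Return a copy of the record expressed in the common unit for its quantity."""
    quantity, unit = record["quantity"], record["unit"]
    convert = TO_CANONICAL[(quantity, unit)]  # KeyError signals an unknown unit
    return {
        "quantity": quantity,
        "value": convert(record["value"]),
        "unit": CANONICAL_UNIT[quantity],
    }


boiling = normalize({"quantity": "temperature", "value": 212.0, "unit": "F"})
marathon = normalize({"quantity": "distance", "value": 42.195, "unit": "km"})
```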

It is possible to use the disclosed system, for instance, for the purposes of Complex Event Processing (CEP), which entails real-time processing of event datastreams, through the use of workflow pipelines to extract and analyze important data from a datastream to determine characteristics about an event.

FIG. 19 is a diagram of computing operating states 1900 for a pipeline according to one aspect of the present invention. A pipeline 1800, and its component stages, will operate in one of a set of possible operating states. The pipeline begins in an idle state 1901 once it has been loaded into computing resources. No data processing operations occur in this state. The pipeline next enters a started state 1902 when the pipeline is launched. From here, the pipeline can transition to a running state 1903 or go to a stopped state 1906. Data is processed through each of the stages in the pipeline while in the running state 1903.

When a set of data has been completely processed, the pipeline can go to a paused state 1904 or a stopped state. In both cases, data processing is halted. From a paused state 1904, the data processing may resume from its last point in the data by restarting the pipeline to return it to a running state 1903.

From a stopped state 1906, the pipeline may enter a deleted state 1907 when its stages and computing components are removed from the computing resources. The pipeline enters an updated state 1905 either when changes are made to the existing graph defining the data flow within the pipeline or when a base Docker image used to create the pipeline changes in a way that requires changes to existing pipelines. The stopped pipeline is reconfigured in the update state to permit the new definition for the pipeline to operate on data when the pipeline returns to a running state 1903 from the update state 1905.
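
The operating states described with reference to FIG. 19 may be modeled, as a sketch, by a table of permitted transitions. The table below is one reading of the text: in particular, the transition from the stopped state to the updated state is inferred from the statement that the stopped pipeline is reconfigured in the update state, and the names PipelineState, ALLOWED, transition and walk are hypothetical.

```python
from enum import Enum
from typing import Dict, List, Set


class PipelineState(Enum):
    IDLE = "idle"          # 1901
    STARTED = "started"    # 1902
    RUNNING = "running"    # 1903
    PAUSED = "paused"      # 1904
    UPDATED = "updated"    # 1905
    STOPPED = "stopped"    # 1906
    DELETED = "deleted"    # 1907


# Assumed transition table derived from the description; not authoritative.
ALLOWED: Dict[PipelineState, Set[PipelineState]] = {
    PipelineState.IDLE: {PipelineState.STARTED},
    PipelineState.STARTED: {PipelineState.RUNNING, PipelineState.STOPPED},
    PipelineState.RUNNING: {PipelineState.PAUSED, PipelineState.STOPPED},
    PipelineState.PAUSED: {PipelineState.RUNNING},
    PipelineState.STOPPED: {PipelineState.UPDATED, PipelineState.DELETED},
    PipelineState.UPDATED: {PipelineState.RUNNING},
    PipelineState.DELETED: set(),
}


def transition(current: PipelineState, target: PipelineState) -> PipelineState:
    """Return the target state if the move is permitted, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


def walk(path: List[PipelineState]) -> PipelineState:
    """Apply a sequence of transitions starting from the idle state."""
    state = PipelineState.IDLE
    for target in path:
        state = transition(state, target)
    return state


final_state = walk([
    PipelineState.STARTED,
    PipelineState.RUNNING,
    PipelineState.STOPPED,
    PipelineState.UPDATED,
    PipelineState.RUNNING,
])
```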

FIG. 21 is a system diagram detailing the components of a Production Rule System (PRS), according to an embodiment. A client 2105 computer connects to a production rule system (PRS) 2110 via a REST API 2111 over a network. A PRS is a rule system which enables many different functionalities, including making external function calls to domain-specific oracles, providing for generalization of semantic and datastream processing rules and preventing rule creep when defining multiple transitivity properties, allowing for scalar value comparisons of data (comparing ages, distances, etc.), allowing for aggregation of facts and rules from different knowledge bases, graphs, or both, allowing for JSON conversion of rules to and from a GraphStack with a universally unique identifier (UUID), providing the ability to instantiate nodes with specified properties in a GraphStack, providing for a message queue through a Command-Line Interface (CLI) or Graphical User Interface (GUI) and rule building through an API, and allowing for new rules and modified rules to be updated with a real-time visualization. A REST API 2111 provides the forward-facing access to PRS 2110 functionality for a client 2105, the PRS further comprising a set of core components 2120 which operate a further set of construction and evaluation protocols 2130. The construction and evaluation protocols include a data parser 2131, data evaluator 2132, and data constructor 2133. The remaining core components 2120 include at least an engine 2121 which drives the overall system and receives semantic data from a data construction component 2133, forwarding processed data to a fact registry interface 2122 and a PRS client 2112. A fact registry interface 2122 may register new data selectively or automatically with a knowledge base 2140 which includes a directed knowledge graph and a multidimensional time series database (MDTSDB), and communicate registered data and the result of attempts to register new data with the PRS engine 2121. A PRS engine 2121 operates the construction and evaluation protocols 2130 to parse data sent through the REST API 2111, and sends results and further queries for backend oracles 2150 to a PRS client 2112. A PRS client 2112 represents the PRS system 2110 communicating with backend oracles 2150, which in turn send the results of these modified queries to the client 2105, thus completing the cycle and allowing the rule system 2110 to act as a modular, integrable front-end to other systems for semantic data and API call processing. As an example of a type of rule that might be created by the PRS, the PRS may declaratively specify windowed rules, wherein rules may be established for events occurring within a given time window. For example, a windowed rule may be established that counts the number of login attempts made within a two-minute time window. The window may be a “tumbled” or “sliding” window that repeatedly refreshes on a periodic basis to apply the rule to the time window just prior to the refresh.
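
The windowed rule in the example above, counting login attempts within a two-minute window, may be sketched in both tumbling and sliding form. Timestamps are assumed to be given in seconds, and the function names tumbling_counts and sliding_count are hypothetical; the sketch does not represent how the PRS itself declares or evaluates rules.

```python
from collections import Counter
from typing import Dict, Iterable

WINDOW_SECONDS = 120  # two-minute window


def tumbling_counts(timestamps: Iterable[int], window: int = WINDOW_SECONDS) -> Dict[int, int]:
    """Count events per fixed, non-overlapping window, keyed by window start time."""
    return dict(Counter((ts // window) * window for ts in timestamps))


def sliding_count(timestamps: Iterable[int], now: int, window: int = WINDOW_SECONDS) -> int:
    """Count events in the window of the given length that ends at `now`."""
    return sum(1 for ts in timestamps if now - window < ts <= now)


login_attempts = [0, 30, 119, 120, 250, 300]  # seconds since an arbitrary origin

per_window = tumbling_counts(login_attempts)
recent = sliding_count(login_attempts, now=130)
```

A rule such as "raise an alert when more than a threshold number of attempts fall in one window" could then be expressed as a comparison against either result.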

It is possible to use the disclosed system, for instance, for the purposes of Complex Event Processing (CEP), which entails real-time processing of event datastreams, through the use of workflow pipelines to extract and analyze important data from a datastream to determine characteristics about an event.

Exactly-once semantics settings may be preserved according to some embodiments when registering a new fact or datapoint 2122 in a knowledge base 2140, such that appearance of one semantically similar or identical datapoint in future processed data may achieve idempotency and cause an effect in the system only the first time it is encountered, but not subsequent times. This is analogous to certain machines that have separate “ON” and “OFF” switches, wherein the “ON” switch does not perform any other actions after being pressed an initial time, until the device is turned “OFF.” For instance, an event datastream may be processed with semantic learning and examination that contains reference to a temperature of 72 degrees in a specific geographical area. If that same information is processed again, with exactly-once semantics enabled for this datapoint, then subsequent occurrences of the same area having 72 degrees of temperature will not cause a change in the system or a new event to be catalogued, until the temperature in that area changes to something other than 72, such as 71, at which point the temperature shifting back to 72 will constitute a logged event. In other embodiments, the idempotency may mean that even after a change from the exactly-once occurrence, the occurrence will not trigger a new event.
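
The two idempotency behaviors described above can be illustrated with a minimal Python sketch. The class name FactRegistry, the rearm_on_change flag, and the in-memory dictionaries are assumptions made for illustration and do not describe the actual fact registry interface 2122 or knowledge base 2140.

```python
_MISSING = object()


class FactRegistry:
    """In-memory stand-in for exactly-once fact registration (hypothetical)."""

    def __init__(self, rearm_on_change=True):
        # True: a value is logged again after the fact has changed in between.
        # False: a (key, value) pair is only ever logged once.
        self.rearm_on_change = rearm_on_change
        self._current = {}
        self._seen = set()
        self.events = []

    def register(self, key, value):
        """Return True if this datapoint produced a new logged event."""
        previous = self._current.get(key, _MISSING)
        self._current[key] = value
        if self.rearm_on_change:
            is_new = previous is _MISSING or previous != value
        else:
            is_new = (key, value) not in self._seen
        self._seen.add((key, value))
        if is_new:
            self.events.append((key, value))
        return is_new
```

With the default setting, registering 72, 72, 71, 72 for the same area logs three events (the repeated 72 is ignored until the value changes); with rearm_on_change set to False, the final 72 is also ignored.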

The oracles 2150 may comprise any plurality or combination of services, technologies, and components which are utilized for database storage and data stream processing, and with which the PRS 2110 may communicate to help with backend processing. According to an embodiment, a database may be included either in the oracles 2150 backend or in the knowledge base 2140, or both, to support the integration of fixed-point rule semantics, providing for analysis of data and semantic data especially by comparison to a fixed point after refinement using machine learning.

FIG. 22 is a system diagram illustrating cyclic workflow stages in a pipeline of data analysis, according to an embodiment. A client 2210 system sends a query or batch of data for processing to one of four possible workflow stages, either an environmental stage 2220, a source stage 2230, a transformation stage 2240, or a sink stage 2250. All workflow stages may feed into other workflow stages as shown by directional arrows, or at any point may forward the data from processing in the specified workflow stage back to the client for viewing, without forwarding to another workflow stage. Notably, an environmental data stage 2220 is the only workflow stage capable of transmitting data as-needed between all three of the other workflow stages. An environmental workflow stage is utilized when environmental variables, settings, and initializations must be set, for instance initialization of other workflow stages, or of knowledge graph nodes, or other environmental attributes of interest. A source stage of workflow 2230 is where data is analyzed to determine, broadly speaking, the origin and acquisition of the data, before either returning the result of the workflow immediately to the client 2210 or continuing to a transformation stage 2240. A transformation stage of workflow 2240 is where data may be manipulated, and represents such workflow steps and functionality as starting a data pipeline, shutting down a data pipeline, and editing a data pipeline for the flow and processing of data as required. This stage of the workflow may return to the client 2210 or continue on to a data sink stage 2250, which includes functions regarding where to put or send data after processing, or where to send data as received directly from a client 2210. The workflow diagram as shown illustrates a cyclical nature wherein data and operations can be accomplished in one of several workflow stages, forwarded either to another workflow stage or returned back to the client, and repeated, until a client no longer desires to operate according to the defined workflow.

A novel, declarative domain-specific language (DSL) may be utilized in the workflow cycle. According to a preferred embodiment, several functions of a novel DSL may be utilized, including a capability for bidirectional dependencies on operations (for instance, “A->B” may be used to specify B depending on A before executing, or “B<-A” for the same), channel or domain-specific directional dependencies (for instance, “A->(“EXAMPLE”,B)” may be interpreted as B has a dependency on A's EXAMPLE signal, channel, or argument), and multi-argument support (for instance, “A->(set(“EXAMPLE”,“EXAMPLE 2”),B)”), and may be modular, for new language definitions and uses to be defined as needed.
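
As a rough illustration of how the dependency forms listed above might be interpreted, the following Python sketch parses a single expression into an (upstream, downstream, channels) triple. The parser, its name, and the use of straight double quotes for channel names are assumptions made for this example; the actual grammar of the DSL is not specified here.

```python
import re

_QUOTED = re.compile(r'"([^"]*)"')


def parse_dependency(expr):
    """Return (upstream, downstream, channels) for one dependency expression."""
    expr = expr.strip()
    if "<-" in expr:
        # "B<-A": B depends on A, with no named channel.
        downstream, upstream = (part.strip() for part in expr.split("<-", 1))
        return upstream, downstream, ()
    upstream, target = (part.strip() for part in expr.split("->", 1))
    if not target.startswith("("):
        # "A->B": B depends on A, with no named channel.
        return upstream, target, ()
    inner = target[1:-1]  # drop the outer parentheses
    channel_part, downstream = inner.rsplit(",", 1)
    channels = tuple(_QUOTED.findall(channel_part))
    return upstream, downstream.strip(), channels
```

Under these assumptions, both 'A->B' and 'B<-A' yield ('A', 'B', ()), while 'A->(set("EXAMPLE","EXAMPLE 2"),B)' yields ('A', 'B', ('EXAMPLE', 'EXAMPLE 2')).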

Description of Method Embodiments

FIG. 24 is a flow diagram illustrating an exemplary overview of a process for automated planning, according to a preferred embodiment. According to the embodiment, a distributed computational graph (DCG) 2410 receives system observations from a system observation engine 2420. These system state observations are used by the DCG to determine a set of execution instructions for a planning task, which are provided to an AP service 2320 for execution. AP service 2320 retrieves an initial state 2430 and set of objectives 2440 from storage 2310 a-n, to form the boundary conditions for the work task (that is, what the system state looks like at the beginning of execution, and what it needs to look like when execution has completed). Additional input may also be received or collected from external sources 2450, such as user input or online content retrieved through RESTful APIs. The execution instructions are then broken up into discrete work tasks by master nodes within the AP service 2320 (as described above, with reference to FIG. 23) and work tasks are then assigned to worker nodes operating within the AP service (as described above in FIG. 23). When all work tasks have concluded, the AP service produces a set of execution plans and policies that are then provided to the DCG 2410, which in turn produces a set of actions to be taken based on the plans and policies. In this manner, a concrete action plan is produced from state observations and machine learning through the use of DCG pipelines and parallel work processing using the AP service worker nodes, automating the planning process and producing a clear path to reach a desired end state.

FIG. 25 is a flow diagram illustrating an exemplary process for a single-run AP job. According to a single-run AP job process, when tasked with a set of execution instructions an AP service 2320 first validates all input data 2501. This includes (but is not limited to) validating known initial information about the world and system (these describe environmental conditions and context in which the job is being run), objectives for the work job, available resources, constraints on job execution, and initial state expectation (that is, what the system anticipates the starting state to look like). AP service 2320 then constructs a plurality of component models and a planning instance 2502, which serve as a simulation model for developing work tasks within the context of the specific job. Work task execution then comprises a multi-step operation 2510 utilizing worker nodes to process discrete tasks in parallel, beginning with seeding the newly-constructed planning instance with parameterized models 2511; in other words, the instance is populated with simulated models of various factors such as (for example, including but not limited to) environmental factors or job constraints as described above, based on the component models generated by the master nodes (which are provided to worker nodes as needed by their respective master nodes). Execution is then initialized 2512, and master nodes assign individual work tasks to worker nodes for processing. Results of work task execution are collected and analyzed by master nodes 2513, and any uncertainty estimation is performed 2514 as appropriate. The final results, including any uncertainty values to place the results in the proper context for decision making, are then provided to the requestor 2515.
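
The single-run flow above can be approximated, purely for illustration, with the following Python sketch. A thread pool stands in for the worker nodes, a toy random walk stands in for the simulation, and the population standard deviation stands in for uncertainty estimation; the job dictionary keys, function names, and all of these substitutions are assumptions and not the AP service's actual design.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def validate_inputs(job):
    """Step 2501: reject jobs that lack the inputs this sketch relies on."""
    missing = [k for k in ("initial_state", "objective", "resources") if k not in job]
    if missing:
        raise ValueError(f"missing job inputs: {missing}")


def work_task(params):
    """One simulated rollout: number of steps needed to reach the objective."""
    rng = random.Random(params["seed"])
    state, steps = params["initial_state"], 0
    while state < params["objective"]:
        state += rng.choice((1, 2))  # hypothetical stochastic action effect
        steps += 1
    return steps


def run_single_job(job, n_tasks=8):
    validate_inputs(job)  # 2501
    instance = {"initial_state": job["initial_state"],
                "objective": job["objective"]}  # 2502: planning instance
    tasks = [dict(instance, seed=i) for i in range(n_tasks)]  # 2511: seeding
    with ThreadPoolExecutor(max_workers=job["resources"]) as pool:  # 2512
        results = list(pool.map(work_task, tasks))
    return {"expected_steps": statistics.mean(results),  # 2513: analysis
            "uncertainty": statistics.pstdev(results),   # 2514: uncertainty
            "runs": len(results)}                        # 2515: to requestor
```

The returned dictionary pairs the result with an uncertainty value so that a requestor can judge how much weight to place on the estimate.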

FIG. 26 is a block diagram illustrating an exemplary process for a multiple-run AP job. In a multiple-run job, execution does not conclude immediately when a single AP processing job is complete; instead, additional runs are executed iteratively until a plurality of end conditions (such as a timer expiring, a specified number of runs being reached, or a specific desired outcome being generated) are met. In a multiple-run job, after initial validation 2601 a plurality of end conditions are specified 2602. Initial parameters and planning instance are then constructed 2603, and execution of the simulation by worker nodes then proceeds 2510. As results are collected 2604, analysis now includes determining if end conditions have been met 2604. When the required end conditions are met, such as a specified number of runs, master nodes are instructed to stop assigning new work tasks to worker nodes; this effectively halts the execution once all pending tasks have completed, without needing to send multiple stop instructions or synchronize activity between nodes. If a worker node is not assigned work, it simply waits for the next task to be assigned; worker nodes need not have any awareness of the state of an overall AP job, and simply carry out individual tasks as they are assigned. When execution has concluded, results are published to the requestor 2605.
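
A minimal sequential sketch of the multiple-run loop is given below, assuming hypothetical names throughout. The loop plays the role of a master node: once an end condition is met it simply stops issuing new runs, and the run_once callable (standing in for a worker task) never needs to know about the overall job state.

```python
import itertools
import time


def end_conditions_met(results, started, max_runs, timeout_s, target):
    """Check the end conditions specified at step 2602 (any one suffices here)."""
    if max_runs is not None and len(results) >= max_runs:
        return True
    if timeout_s is not None and time.monotonic() - started >= timeout_s:
        return True
    return target is not None and bool(results) and target(results[-1])


def run_multi_job(run_once, max_runs=None, timeout_s=None, target=None):
    if max_runs is None and timeout_s is None and target is None:
        raise ValueError("at least one end condition is required")
    started, results = time.monotonic(), []
    for run_id in itertools.count():
        if end_conditions_met(results, started, max_runs, timeout_s, target):
            break  # stop assigning new work; nothing else needs to be signalled
        results.append(run_once(run_id))
    return results  # 2605: published to the requestor
```

Whether all end conditions or any single one must hold is a design choice; this sketch stops on the first condition satisfied.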

FIG. 10 is a process flow diagram of a method 1000 for predictive analysis of very large data sets using the distributed computational graph. One or more streams of data from a plurality of sources, which includes, but is in no way limited to, a number of physical sensors, web based questionnaires and surveys, monitoring of electronic infrastructure, crowd sourcing campaigns, and direct human interaction, may be received by system 1001. The received stream is filtered 1002 to exclude data that has been corrupted, data that is incomplete or misconfigured and therefore unusable, data that may be intact but nonsensical within the context of the analyses being run, as well as a plurality of predetermined analysis related and unrelated criteria set by the authors. Filtered data may be split into two identical streams at this point (second stream not depicted for simplicity), wherein one sub stream may be sent for batch processing 1600 while another sub stream may be formalized 1003 for transformation pipeline analysis 1004, 561, 600, 700, 800, 900. Data formalization for transformation pipeline analysis acts to reformat the stream data for optimal, reliable use during analysis. Reformatting might entail, but is not limited to: setting data field order, standardizing measurement units if choices are given, splitting complex information into multiple simpler fields, and stripping unwanted characters, again, just to name a few simple examples. The formalized data stream may be subjected to one or more transformations. Each transformation acts as a function on the data and may or may not change the data. Within the invention, transformations working on the same data stream where the output of one transformation acts as the input to the next are represented as transformation pipelines. While the great majority of transformations in transformation pipelines receive a single stream of input, modify the data within the stream in some way and then pass the modified data as output to the next transformation in the pipeline, the invention does not require these characteristics. According to the embodiment, individual transformations can receive input of expected form from more than one source 1300 or receive no input at all, as would a transformation acting as a timestamp. According to the embodiment, individual transformations may not modify the data, as would be encountered with a data store acting as a queue for downstream transformations 1303, 1305, 1405, 1407, 1505.

According to the embodiment, individual transformations may provide output to more than one downstream transformation 1400. This ability lends itself to simulations where multiple possible choices might be made at a single step of a procedure, all of which need to be analyzed. While only a single, simple use case has been offered for each example, in each case that example was chosen for simplicity of description from a plurality of possibilities; the examples given should not be considered to limit the invention to only simplistic applications. Last, according to the invention, transformations in a transformation pipeline backbone may form a linear or a quasi-linear arrangement, or may be cyclical 1500, where the output of one of the internal transformations serves as the input of one of its antecedents, allowing recursive analysis to be run. The result of transformation pipeline analysis may then be modified by results from batch analysis 1005 of the data stream 1600 and output 1006 in a format predesigned by the authors of the analysis, which could be a human-readable summary printout, human-readable instruction printout, human-readable raw printout, data store, or machine encoded information of any format that may be used in further automated analysis or action schema.

FIG. 11 is a process flow diagram of a method 1100 for an embodiment of modeling the transformation pipeline module 561 of the invention as a directed graph using graph theory. According to the embodiment, the individual transformations 1102, 1104, 1106 of the transformation pipeline t₁ . . . t_(n), such that each t_(i) ∈ T, are represented as graph nodes. Transformations belonging to T are discrete transformations over individual datasets d_(i), consistent with classical functions. As such, each individual transformation t_(j) receives a set of inputs and produces a single output. The input of an individual transformation t_(i) is defined with the function in: t_(i) → d₁ . . . d_(k) such that in(t_(i)) = {d₁ . . . d_(k)}, and describes a transformation with k inputs. Similarly, the output of an individual transformation is defined as the function out: t_(i) → [d₁] to describe transformations that produce a single output (usable by other transformations). A dependency function can now be defined such that dep(t_(a),t_(b)) ⇔ out(t_(a)) ∈ in(t_(b)). The messages carrying the data stream through the transformation pipeline 1101, 1103, 1105 make up the graph edges. Using the above definitions, then, a transformation pipeline within the invention can be defined as G=(V,E) where message(t₁,t₂ . . . t_((n−1)),t_(n)) ∈ V and all transformations t₁ . . . t_(n) and all dependencies dep(t_(i),t_(j)) ∈ E 1107.
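
To make the graph model concrete, the following Python sketch builds a vertex set and an edge set from a list of transformations. It reads the dependency function as "the output of one transformation is among the inputs of another", which is an interpretation of the definitions above; the Transformation class, its field names, and the dataset names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformation:
    name: str
    inputs: frozenset  # in(t): names of the datasets consumed
    output: str        # out(t): name of the single dataset produced


def dep(t_a, t_b):
    """True when t_b consumes the dataset produced by t_a."""
    return t_a.output in t_b.inputs


def build_graph(transformations):
    """Return (V, E): transformation names and directed dependency edges."""
    vertices = {t.name for t in transformations}
    edges = {(a.name, b.name)
             for a in transformations
             for b in transformations
             if dep(a, b)}
    return vertices, edges
```

For three transformations where t2 consumes the output of t1 and t3 consumes the outputs of both, build_graph yields the edges (t1, t2), (t1, t3), and (t2, t3).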

FIG. 12 is a process flow diagram of a method 1200 for one embodiment of a linear transformation pipeline 1201. This is the simplest of configurations, as the input stream is acted upon by the first transformation node 1202 and the remainder of the transformations within the pipeline are then performed sequentially 1202, 1203, 1204, 1205 for the entire pipeline, with no introduction of new data internal to the initial node or splitting of the output stream prior to the last node of the pipeline 1205. The result of the transformation pipeline is then sent back out to any message and output processes 1206. This configuration is the current state of the art for transformation pipelines and is the most general form of these constructs. Linear transformation pipelines require no special manipulation to simplify the data pathway and are thus referred to as non-decomposable. The example depicted in this diagram was chosen to convey the configuration of a linear transformation pipeline and is the simplest form of the configuration felt to show the point. It in no way implies limitation of the invention.

FIG. 13 is a process flow diagram of a method 1300 for one embodiment of a transformation pipeline where one transformation node 1307 in a transformation pipeline receives data streams from two source transformation nodes 1301. The invention handles this transformation pipeline configuration by decomposing or serializing the input events 1302-1303, 1304-1305, heavily relying on post transformation function continuation. The results of individual transformation nodes 1302, 1304 just antecedent to the destination transformation node 1306 are placed into a single specialized data storage transformation node 1303, 1305 (shown twice as the process occurs twice). The combined results are then retrieved from the data store 1306 and serve as the input stream for the transformation node within the transformation pipeline backbone 1307, 1308. The example depicted in this diagram was chosen to convey the configuration of transformation pipelines with individual transformation nodes that receive input from two source nodes 1302, 1304 and is the simplest form of the configuration felt to show the point. It in no way implies limitation of the invention. Any number of permutations and topologies are possible, especially as the invention places no design restrictions on the number of transformation nodes receiving input from more than one source or the number of sources providing input to a destination node.

FIG. 14 is a process flow diagram of a method 1400 for one embodiment of a transformation pipeline where one transformation node 1402 sends output to a second node 1403 in a transformation pipeline, which then may send an output data stream to two destination transformation nodes 1401, 1406, 1408 in potentially two separate transformation pipelines. The invention handles this transformation pipeline configuration by decomposing or serializing the output events 1404, 1405-1406, 1407-1408. The results of the source transformation node 1403 just antecedent to the destination transformation nodes 1406 are placed into a single specialized data storage transformation node 1404, 1405, 1407 (shown three times as storage occurs once and retrieval occurs twice). The results of the antecedent transformation node may then be retrieved from a data store 1404 and serve as the input stream for the transformation nodes of the two downstream transformation pipelines 1406, 1408. The example depicted in this diagram was chosen to convey the configuration of transformation pipelines with individual transformation nodes that send output streams to two destination nodes 1406, 1408 and is the simplest form of the configuration felt to show the point. It in no way implies limitation of the invention. Any number of permutations and topologies are possible, especially as the invention places no design restrictions on the number of transformation nodes sending output to more than one destination or the number of destinations receiving input from a source node.

FIG. 15 is a process flow diagram of a method 1500 for one embodiment of a transformation pipeline where the topology of all or part of the pipeline is cyclical 1501. In this configuration the output stream of one transformation node 1504 acts as an input of an antecedent transformation node within the pipeline 1502. Serialization or decomposition linearizes this cyclical configuration by completing the transformation of all of the nodes that make up a single cycle 1502, 1503, 1504 and then storing the result of that cycle in a data store 1505. That result of a cycle is then reintroduced to the transformation pipeline as input to the first transformation node of the cycle 1506. As this configuration is by nature recursive, special programming to unfold the recursions was developed for the invention to accommodate it. The example depicted in this diagram was chosen to convey the configuration of transformation pipelines with individual transformation nodes that form a cyclical configuration 1501, 1502, 1503, 1504 and is the simplest form of the configuration felt to show the point. It in no way implies limitation of the invention. Any number of permutations and topologies are possible, especially as the invention places no design restrictions on the number of transformation nodes participating in a cycle nor the number of cycles in a transformation pipeline.
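
One way to picture the linearization of a cycle is the Python sketch below, in which a list stands in for data store 1505: each pass runs every node of the cycle in order, stores the result, and feeds the stored result back to the first node. The function name, the stopping predicate, and the iteration cap are assumptions added for illustration and are not the "special programming" referred to above.

```python
def run_cycle(cycle_nodes, initial, should_stop, max_iterations=100):
    """Unroll a cyclic pipeline into repeated linear passes over its nodes."""
    store = [initial]  # stands in for the data store that holds cycle results
    for _ in range(max_iterations):
        value = store[-1]  # reintroduce the stored result as input
        for node in cycle_nodes:
            value = node(value)  # complete every transformation in the cycle
        store.append(value)
        if should_stop(store[-2], value):
            break
    return store
```

As a usage example, a two-node cycle that pairs x with 2/x and then averages the pair converges to the square root of 2 when started from 1.0 and stopped once successive results differ by less than a small tolerance.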

FIG. 16 is a process flow diagram of a method 1600 for one embodiment of the batch data stream analysis pathway which forms part of the invention and allows streaming data to be interpreted with historic context. One or more streams of data from a plurality of sources, which includes, but is in no way limited to, a number of physical sensors, web based questionnaires and surveys, monitoring of electronic infrastructure, crowd sourcing campaigns, and direct human interaction, is received by the system 1601. The received stream may be filtered 1602 to exclude data that has been corrupted, data that is incomplete or misconfigured and therefore unusable, data that may be intact but nonsensical within the context of the analyses being run, as well as a plurality of predetermined analysis related and unrelated criteria set by the authors. Data formalization 1603 for batch analysis acts to reformat the stream data for optimal, reliable use during analysis. Reformatting might entail, but is not limited to: setting data field order, standardizing measurement units if choices are given, splitting complex information into multiple simpler fields, and stripping unwanted characters, again, just to name a few simple examples. The filtered and formalized stream is then added to a distributed data store 1604 due to the vast amount of information accrued over time. The invention has no dependency on specific data stores or data retrieval models. During transformation pipeline analysis of the streaming pipeline, data stored in the batch pathway store can be used to track changes in specifics of the data important to the ongoing analysis over time, repetitive data sets significant to the analysis, or the occurrence of critical points of data 1605. The functions of individual transformation nodes 620 may be saved and can be edited. In addition, all nodes of a transformation pipeline 600 keep a summary or summarized view (analogous to a network routing table) of applicable parts of the overall route of the pipeline along with detailed information pertaining to the two adjacent nodes. This framework information enables steps to be taken and notifications to be passed if individual transformation nodes 640 within a transformation pipeline 600 become unresponsive during analysis operations. Combinations of results from the batch pathway, partial and streaming output results from the transformation pipeline, administrative directives from the authors of the analysis, as well as operational status messages from components of the distributed computational graph are used to perform system sanity checks and retraining of one or more of the modules of the system 1606. These corrections are designed to occur without administrative intervention under all but the most extreme of circumstances, with deep learning capabilities present as part of the system manager and retrain module 563 responsible for this task.

FIGS. 20A-20D are a process flow diagram for a set of processing operations used in a pipeline processing system according to one aspect of the present invention. The controlling system 1703 communicates with and controls the operation of a pipeline using a set of API commands that include Post, Get, Delete, and Put commands. FIG. 20A illustrates the operation of the Post 2001 commands. The Post commands include an api/pipelines post command and an api/pipelines/validate command.

The POST /api/pipelines command has a content type of ‘application/json.’ This command is the entry point. It creates a new pipeline in the database of pipelines but does not start the pipeline. To start the pipeline, call the ‘GET . . . /env’ and ‘GET . . . /data’ commands described below. Invalid pipelines may be saved at this point; future calls to this pipeline will be validated as part of the operation of the command.

The command has the following payload fields under ‘pipeline’: (required) ‘stageGraphBuilder’: a JSON representation of a valid pipeline; (required) ‘version’: the system version expected, where an error will occur if the manager's version is different; (optional) ‘uuid’: if none is provided, one will be created and returned in the response payload; (optional) ‘name’: a human readable name for the pipeline, for which uniqueness is not enforced; (optional) ‘description’: a description for end users; and (optional) ‘tags’: keywords or terms associated with the pipeline (these tags are stored in an array). In operation, the command is received 2011 and data is retrieved from the data store 2012 before deciding if the pipeline in question exists 2013 in the database. If the pipeline is determined to exist, the request is rejected 2014 because the pipeline already exists. If not, the data store is updated 2015 and, if the update is successful 2016, a 201 response with an id==UUID is returned 2017.
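
The create flow above (steps 2011 to 2017) might be sketched as follows, with a Python dictionary standing in for the data store. The function name, the flat pipeline dictionary, the response field names (borrowed from the example responses elsewhere in this description), and the 409 status code used for the rejection are all assumptions; the source states only that an existing pipeline is rejected and that a successful create returns a 201 response carrying the UUID.

```python
import uuid


def post_pipeline(store, pipeline):
    """Create a pipeline record without starting it (illustrative only)."""
    # 'uuid' is optional: generate one when the caller does not supply it.
    pipeline_id = pipeline.get("uuid") or str(uuid.uuid4())
    if pipeline_id in store:  # 2013: does the pipeline already exist?
        return {"pipelineId": pipeline_id,
                "data": "Pipeline already exists",
                "statusCode": 409}  # 2014: rejection; the code is an assumption
    store[pipeline_id] = dict(pipeline, uuid=pipeline_id)  # 2015: update store
    return {"pipelineId": pipeline_id, "data": None, "statusCode": 201}  # 2017
```

Posting the same UUID twice against the same store returns the rejection on the second call and leaves the stored record unchanged.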

The POST /api/pipelines/validate command has a content type of ‘application/json.’ This command validates a pipeline. A pipeline with no environmental stages and no data processing stages is considered invalid. The command uses payload fields: ‘pipeline’ (required): see the ‘POST /api/pipelines’ command described above. An example response is:

Example Response (200 OK):

{ “pipelineId”:“038bf27f-52f0-40cf-95db-b70b83ade772”, “invalidStages”:[], “statusCode”:200 }

The command is received 2021, the pipeline is deserialized 2022, and a pipeline validation call is made 2023. A 200 response is returned with a list of any invalid stages and paths that are found 2024.

FIGS. 20B and 20C illustrate the operation of a set of Get 2002 and Post commands. APACHE FLINK™ will be used throughout the figures to refer to a data management backend. The ‘GET /api/pipelines?tag=A&tag=B’ command has a content type of ‘application/json.’ This command gets the pipelines that are associated with the provided tag(s). An example response (200 OK):

{ “pipelineId”:null, “data”:[ { “name”:“pipeline1”, “description”:null, ... }, { “name”:“pipeline2”, “description”:null, ... } ], “statusCode”:200 }

The GET /api/pipelines/{uuid} command has a content type of ‘application/json.’ The command gets the previously posted pipeline from the database. The command is received 2031 and data is obtained from the data store 2032. If the pipeline exists 2033 in the database, a pipeline definition is returned 2035; otherwise a reject 404 pipeline not found is returned 2034.

The POST /api/pipelines/{uuid}/env/start command calls the environmental setup for a pipeline. An example response (202 Accepted):

{ “pipelineId”:“2db14f86-29c4-4067-a7ac-e05c24035c3a”, “data”:“Environmental setup for pipeline [2db14f86-29c4-4067-a7ac-e05c24035c3a] started”, “statusCode”:202 }

The command is received 2041 and data is obtained from the data store 2042. If the pipeline exists 2043 in the database, a request accepted is returned 2044; otherwise a reject 404 pipeline not found is returned 2034.

The POST /api/pipelines/{uuid}/env/status command returns the statuses of the environmental stages. An example response (200 OK):

{ “pipelineId”: “51afaae4-ddce-42af-ba0a-f341075e412b”, “data”: [{“uuid”: “bc63f730-ed89-4124-bae9-c31c378802cc”, “state”: “SUCCESS” }, {“uuid”: “a36dda4e-l7fe-4afD-a745-0ea4dc3e948c”, “state”: “SUCCESS” }],“statusCode”: 200 }

The command is received 2051 and data is obtained from the data store 2052. If the pipeline exists 2053 in the database, a stage ID and status is obtained 2054 and returned 2055; otherwise a reject 404 pipeline not found is returned 2034.

The POST /api/pipelines/{uuid}/env/stop command calls the environmental teardown in a pipeline. If the data processing stages are still running when this endpoint is called, this endpoint returns an error. In other words, call ‘POST . . . /data/stop’ before calling this endpoint. An example response (202 Accepted):

{ “pipelineId”:“038bf27f-52f0-40cf-95db-b70b83ade772”, “data”:null, “statusCode”:202 }

The command is received 2061 and data is obtained from the data store 2062. If the pipeline exists 2063 in the database, a request accepted is returned 2064; otherwise a reject 404 pipeline not found is returned 2034.

The command is received 2071 and a test is performed to determine if an active pipeline exists 2072 in Flink. If the pipeline is already active, an “already running” rejection is returned 2100; otherwise a test to determine if the pipeline exists 2073 is performed. If the pipeline is not in the database, a reject 404 pipeline not found is returned 2074. If the pipeline exists in the database, the pipeline is deserialized 2075. A test to determine if the operation was a success 2076 is performed and if not, a rejection “ENV is not in a proper state” is returned 2077. If a success was detected, a request to Flink is made 2078 and a status of the request is tested 2079 a. If the status is good, the accepted work is returned 2079 b; otherwise a “Reject Flink rejects pipeline state” is returned 2079 c.

The POST /api/pipelines/{uuid}/start/all command starts both the environmental and the data processing stages in a pipeline. It starts the pipeline from the most recent save point, if one exists. The command uses payload parameters: ‘taskmanager-heap-mb’: the amount of heap to allocate to each task manager; ‘jobmanager-heap-mb’: the amount of heap to allocate to each job manager (the number of job managers is one); ‘taskmanager-slots’: the number of slots to allocate per task manager; ‘taskmanager-cpu-count’: the number of CPU cores to allocate per task manager; ‘jobmanager-cpu-count’: the number of CPU cores to allocate to the job manager; ‘job-parallelism’: the number of parallel instances to run at once; (optional) ‘job-checkpoint-timeout-seconds’ (default: 600): the number of seconds before a checkpoint or savepoint is considered failed; (optional) ‘job-checkpoint-pause-seconds’ (default: 30): the number of seconds to wait before starting another checkpoint after a checkpoint completes; and (optional) ‘job-checkpoint-frequency-seconds’ (default: 60): the interval in seconds by which checkpoints should occur. The command returns a 200 (OK) status instead of a 202 (Accepted) because Flink's API returns a 200 when submitting a job. An example response (200 OK):

{ “pipelineId”:“038bf27f-52f0-40cf-95db-b70b83ade772”, “data”:null, “statusCode”:200 }

The GET /api/pipelines/{uuid}/data/status command 2081 returns the status of the data processing stages by first determining if the pipeline exists in Flink 2082, following up with a check for the pipeline in the database if the pipeline does not exist in Flink 2083. If it does exist in the database, a “pipeline never started” status may be returned 2084, while if the pipeline does not exist in the database, a “404 pipeline not found” 2074 error may be returned. If, however, the pipeline does exist in Flink 2082, the Flink status of the pipeline is fetched 2085 and returned 2086. An example response (200 OK):

{ “pipelineId”:“038bf27f-52f0-40cf-95db-b70b83ade772”, “data”:“RUNNING”, “statusCode”:200 }
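
The decision order of the data status command can be summarized in a short Python sketch. Two dictionaries stand in for the Flink job listing and the pipeline database; the function name and the 200 status code attached to the "pipeline never started" result are assumptions, since the source specifies only the message for that branch.

```python
def data_status(pipeline_id, flink_jobs, database):
    """Resolve the data-stage status using the order of checks in the flow."""
    if pipeline_id in flink_jobs:  # 2082: is the pipeline known to the backend?
        return {"pipelineId": pipeline_id,
                "data": flink_jobs[pipeline_id],  # 2085: fetch backend status
                "statusCode": 200}                # 2086: return it
    if pipeline_id in database:  # 2083: known to the database only
        return {"pipelineId": pipeline_id,
                "data": "pipeline never started",  # 2084
                "statusCode": 200}  # status code assumed for this sketch
    return {"pipelineId": pipeline_id,
            "data": "pipeline not found",
            "statusCode": 404}  # 2074
```

Checking the processing backend before the database means that a running pipeline is reported from its live state rather than from its stored definition.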

The POST /api/pipelines/{uuid}/data/stop command stops the data processing stages in a pipeline (i.e., calls Flink with a save point). It returns an error if the pipeline does not have data processing stages. The command uses the request parameter ‘graceful’ (optional), which indicates whether to stop the pipeline with a save point. Acceptable values: ‘true’, ‘false’ (defaults to ‘true’). An example response (202 Accepted):

{ “pipelineId”:“038bf27f-52f0-40cf-95db-b70b83ade772”, “data”:null, “statusCode”:202 }

The POST /api/pipelines/{uuid}/stop/all command stops the data processing stages in a pipeline (i.e., calls Flink with a save point). It returns an error if the pipeline does not have data processing stages. The command uses the request parameter ‘graceful’ (optional), which indicates whether to stop the pipeline with a save point. Acceptable values: ‘true’, ‘false’ (defaults to ‘true’). An example response (202 Accepted):

{ “pipelineId”:“038bf27f-52fb-40cf-95db-b70b83ade772”, “data”:null, “statusCode”:202 }

In both of the above stop commands, the command is received 2091 and a test 2092 determines if the pipeline exists in Flink. If it exists, a request to Flink 2095 is made and the Flink results are returned 2096; otherwise a reject “Pipeline ID is not running” is returned 2094.

FIG. 20D illustrates the operation of the Delete 2003 and Put commands. The DELETE /api/pipelines/{uuid} command 2003 deletes the pipeline from the database. It does not stop the pipeline, so the user is expected to call ‘GET . . . /stop’ first. Calls to delete the pipeline while it is active will be rejected. The command is received 2301 and data is retrieved from the data store 2302. Test 2303 determines if the pipeline exists in the database. If not, a “Reject 404 pipeline not found” is returned; otherwise test 2305 determines if the pipeline is active in Flink. If not, a “Reject 404 pipeline not found” is also returned 2304; otherwise the pipeline is deserialized 2306 and a stop Env function is called 2307. Test 2308 determines whether the teardown was successful. If so, the pipeline is updated to indicate a new state 2309; otherwise remediation may be initiated 2310.

The PUT /api/pipelines command 2004 is a command having a content type of ‘application/json’. This command updates the pipeline in the database, but does not start or stop the pipeline. A pipeline with no environmental stages and no data processing stages is considered invalid. The command uses the payload field ‘pipeline’ (required); see ‘POST /api/pipelines’. The uuid of the pipeline to update must be in the payload. An example response (200 OK):

{ “pipelineId”:“038bf27f-52fb-40cf-95db-b70b83ade772”, “data”:“Pipeline updated”, “statusCode”:200 }

The command is received 2401 and data is retrieved from the data store 2402. Test 2403 determines if the pipeline exists in the database. If not, a “Reject 404 pipeline not found” is returned 2404; otherwise test 2405 determines if the ENV has not been started. If it has not been started, a “Reject cannot update pipeline not active” is returned 2408; otherwise the pipeline is inserted into the database 2406 and a success indication is returned 2407.
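The update flow just described can be sketched in Python as below. This is a hedged sketch rather than the actual implementation: the function names, the payload keys `environmentStages` and `dataStages`, the in-memory `database` dictionary, and the `started_envs` set are all hypothetical, and the 400 and 409 status codes are assumptions. Only the 404 and the 200 “Pipeline updated” response come from the description, and the point at which the validity check runs is also assumed.

```python
def is_valid(pipeline: dict) -> bool:
    """A pipeline with no environmental and no data processing stages is invalid."""
    return bool(pipeline.get("environmentStages")) or bool(pipeline.get("dataStages"))


def respond(pipeline_id, data, status_code: int) -> dict:
    return {"pipelineId": pipeline_id, "data": data, "statusCode": status_code}


def update_pipeline(payload: dict, database: dict, started_envs: set) -> dict:
    """Sketch of steps 2401-2408 for PUT /api/pipelines."""
    pipeline = payload.get("pipeline")
    if not isinstance(pipeline, dict) or "uuid" not in pipeline:
        # The 'pipeline' field and its uuid are required (status code assumed)
        return respond(None, "pipeline with uuid is required", 400)
    pid = pipeline["uuid"]
    if not is_valid(pipeline):
        return respond(pid, "pipeline has no stages", 400)  # status code assumed
    if pid not in database:  # test 2403
        return respond(pid, "pipeline not found", 404)  # 2404
    if pid not in started_envs:  # test 2405: ENV has not been started
        return respond(pid, "cannot update pipeline not active", 409)  # 2408, code assumed
    database[pid] = pipeline  # 2406: store the updated pipeline
    return respond(pid, "Pipeline updated", 200)  # 2407
```

The sketch only writes to the data store, consistent with the statement that the command does not start or stop the pipeline.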

The skilled person will be aware of a range of possible modifications of the various embodiments described above. Accordingly, the present invention is defined by the claims and their equivalents.

What is claimed is:
 1. A system for management and tracking of collaborative projects, comprising: an automated planning service comprising a memory, a processor, and a plurality of programming instructions stored in the memory thereof and operable on the processor thereof, wherein the programmable instructions, when operating on the processor, cause the processor to: operate a plurality of master nodes, each master node in turn operating a plurality of worker nodes; receive a plurality of simulation conditions at a master node; construct a plurality of simulation components based on the received simulation conditions; construct a planning model based on the received simulation conditions; assign, using a master node, a plurality of discrete simulation tasks to a plurality of worker nodes, each of the plurality of discrete simulation tasks being based on the constructed simulation components and planning model, wherein each of the plurality of worker nodes is assigned exactly one of the plurality of discrete simulation tasks at any given time during operation; analyze results of each of the plurality of discrete simulation tasks as they are completed; and provide the analyzed results as output.
 2. The system of claim 1, wherein the simulation conditions are retrieved from a data storage.
 3. The system of claim 1, wherein the simulation conditions are received via a RESTful API.
 4. The system of claim 1, wherein the analysis comprises determining an uncertainty value for at least a portion of the results.
 5. The system of claim 1, wherein the analysis comprises determining whether a plurality of end conditions have been met, and if the plurality of end conditions have been met, halting the assignment of discrete simulation tasks to worker nodes.
 6. A method for management and tracking of collaborative projects, comprising the steps of: operating, at an automated planning service, a plurality of master nodes, each master node in turn operating a plurality of worker nodes; receiving a plurality of simulation conditions at a master node; constructing a plurality of simulation components based on the received simulation conditions; constructing a planning model based on the received simulation conditions; assigning, using a master node, a plurality of discrete simulation tasks to a plurality of worker nodes, each of the plurality of discrete simulation tasks being based on the constructed simulation components and planning model, wherein each of the plurality of worker nodes is assigned exactly one of the plurality of discrete simulation tasks at any given time during operation; analyzing results of each of the plurality of discrete simulation tasks as they are completed; and providing the analyzed results as output.
 7. The method of claim 6, wherein the simulation conditions are retrieved from a data storage.
 8. The method of claim 6, wherein the simulation conditions are received via a RESTful API.
 9. The method of claim 6, wherein the analysis comprises determining an uncertainty value for at least a portion of the results.
 10. The method of claim 6, wherein the analysis comprises determining whether a plurality of end conditions have been met, and if the plurality of end conditions have been met, halting the assignment of discrete simulation tasks to worker nodes.